Abstract

Conditional identity in distribution (Berti et al. (2004)) is a new type of dependence for random
variables, which generalizes the well-known notion of exchangeability. In this paper, a class
of random sequences, called Generalized Species Sampling Sequences, is defined and a condition to
have conditional identity in distribution is given. Moreover, a class of generalized species sampling
sequences that are conditionally identically distributed is introduced and studied: the Generalized
Ottawa sequences (GOS). This class contains a "randomly reinforced" version of the Polya urn and of
the Blackwell-MacQueen urn scheme. For the empirical means and the predictive means of a GOS,
we prove two convergence results toward suitable mixtures of Gaussian distributions. The first one is
in the sense of stable convergence and the second one in the sense of almost sure conditional
convergence. In the last part of the paper we study the length of the partition induced by a GOS at time
n, i.e. the random number of distinct values of a GOS until time n. Under suitable conditions, we
prove a strong law of large numbers and a central limit theorem in the sense of stable convergence.
All the results given in the paper are accompanied by some examples.

Key-words: species sampling sequence, conditional identity in distribution, stable convergence, 
almost sure conditional convergence, generalized Polya urn. 



1 Introduction 

A sequence $(X_n)_{n\ge 1}$ of random variables defined on a probability space $(\Omega, \mathcal{A}, P)$ and taking values in a
Polish space is said to be a species sampling sequence if (a version of) the regular conditional distribution
of $X_{n+1}$ given $X(n) := (X_1, \dots, X_n)$ is the transition kernel

$$K_{n+1}(\omega,\cdot) := \sum_{k=1}^{n} p_{n,k}(\omega)\,\delta_{X_k(\omega)}(\cdot) + r_n(\omega)\,\mu(\cdot) \qquad (1)$$

where $p_{n,k}(\cdot)$ and $r_n(\cdot)$ are real-valued measurable functions of $X(n)$ and $\mu$ is a probability measure.
See Pitman (1996).

As explained in Hansen and Pitman (2000), a species sampling sequence $(X_n)_{n\ge 1}$ can be interpreted
as the sequential random sampling of individuals' species from a possibly infinite population
of individuals belonging to several species. If, for the sake of simplicity, we assume that $\mu$ is diffuse,
then the interpretation is the following. The species of the first individual to be observed is assigned
a random tag $X_1$, distributed according to $\mu$. Given the tags $X_1, \dots, X_n$ of the first $n$ individuals
observed, the species of the $(n+1)$-th individual is a new species with probability $r_n$ and it is equal
to the observed species $X_k$ with probability $\sum_{j=1}^{n} p_{n,j} I_{\{X_j = X_k\}}$.

The concept of species sampling sequence is naturally related to that of random partition induced
by a sequence of observations. Given a random vector $X(n) = (X_1, \dots, X_n)$, we denote by $L_n$ the
(random) number of distinct values of $X(n)$ and by $X^*(n) = (X_1^*, \dots, X_{L_n}^*)$ the random vector of
the distinct values of $X(n)$ in the order in which they appear. The random partition induced by $X(n)$
is the random partition of the set $\{1, \dots, n\}$ given by $\pi^{(n)} = [\pi_1^{(n)}, \dots, \pi_{L_n}^{(n)}]$ where

$$i \in \pi_k^{(n)} \iff X_i = X_k^*.$$

Two distinct indices $i$ and $j$ clearly belong to the same block $\pi_k^{(n)}$ for a suitable $k$ if and only if
$X_i = X_j$. It follows that the prediction rule (1) can be rewritten as

$$K_{n+1}(\omega,\cdot) = \sum_{k=1}^{L_n(\omega)} p^*_{n,k}(\omega)\,\delta_{X_k^*(\omega)}(\cdot) + r_n(\omega)\,\mu(\cdot) \qquad (2)$$

where

$$p^*_{n,k} := \sum_{j \in \pi_k^{(n)}} p_{n,j}.$$



In Hansen and Pitman (2000) it is proved that if $\mu$ is diffuse and $(X_n)_{n\ge 1}$ is an exchangeable
sequence, then the coefficients $p^*_{n,k}$ are almost surely equal to some function of $\pi^{(n)}$ and they must
satisfy a suitable recurrence relation. Although there are only a few explicit prediction rules which
give rise to exchangeable sequences, prediction rules of this kind are appealing for many reasons.
Indeed, exchangeability is a very natural assumption in many statistical problems, in particular
from the Bayesian viewpoint, as well as for many stochastic models. Moreover, remarkable results
are known for exchangeable sequences: among others, such sequences satisfy a strong law of large
numbers and they can be completely characterized by the well-known de Finetti representation
theorem. See, e.g., Aldous (1985). Further, for an exchangeable sequence the empirical mean
$\frac{1}{n}\sum_{k=1}^{n} f(X_k)$ and the predictive mean, i.e. $E[f(X_{n+1}) \mid X_1, \dots, X_n]$, converge to the same limit
as the number of observations goes to infinity. This fact can be invoked to justify the use of the
empirical mean in the place of the predictive mean, which is usually harder to compute. Nevertheless,
in some situations the assumption of exchangeability can be too restrictive. For instance, instead of a
classical Polya urn scheme, it may be useful to deal with the so-called randomly reinforced Polya urn
scheme. See, for example, Crimaldi (2007), Crimaldi and Leisen (2008), Flournoy and May (2008) and
May, Paganoni and Secchi (2005). Such a process fails to be exchangeable but it can still be
described with a prediction rule which is not too far from (1), see Example 3.4 of the present paper.
Our purpose is to introduce and study a class of generalized species sampling sequences, which are
generally not exchangeable but which still have interesting mathematical properties.
We thus need to recall the notion of conditional identity in distribution, introduced and studied in
Berti, Pratelli and Rigo (2004). Such form of dependence generalizes the notion of exchangeability
preserving some of its nice predictive properties. One says that a sequence $(X_n)_{n\ge 1}$, defined on
$(\Omega, \mathcal{A}, P)$ and taking values in a measurable space $(E, \mathcal{E})$, is conditionally identically distributed with
respect to a filtration $\mathcal{G} = (\mathcal{G}_n)_{n\ge 0}$ (in the sequel, $\mathcal{G}$-CID for short), whenever $(X_n)_{n\ge 1}$ is $\mathcal{G}$-adapted
and, for each $n \ge 0$, $j \ge 1$ and every measurable real-valued bounded function $f$ on $E$,

$$E[f(X_{n+j}) \mid \mathcal{G}_n] = E[f(X_{n+1}) \mid \mathcal{G}_n].$$

This means that, for each $n \ge 0$, all the random variables $X_{n+j}$, with $j \ge 1$, are identically distributed
conditionally on $\mathcal{G}_n$. It is clear that every exchangeable sequence is a CID sequence with respect to
its natural filtration but a CID sequence is not necessarily exchangeable. Moreover, it is possible to
show that a $\mathcal{G}$-adapted sequence $(X_n)_{n\ge 1}$ is $\mathcal{G}$-CID if and only if, for each measurable real-valued
bounded function $f$ on $E$,

$$V_n^f := E[f(X_{n+1}) \mid \mathcal{G}_n]$$

is a $\mathcal{G}$-martingale, see Berti, Pratelli and Rigo (2004). Hence, the sequence $(V_n^f)_{n\ge 0}$ converges almost
surely and in $L^1$ to a random variable $V_f$. One of the most important features of CID sequences
is the fact that this random variable $V_f$ is also the almost sure limit of the empirical means. More
precisely, CID sequences satisfy the following strong law of large numbers: for each real-valued
bounded measurable function $f$ on $E$, the sequence $(M_n^f)_{n\ge 1}$, defined by

$$M_n^f := \frac{1}{n}\sum_{k=1}^{n} f(X_k), \qquad (3)$$

converges almost surely and in $L^1$ to $V_f$. It follows that also the predictive mean $E[f(X_{n+1}) \mid X_1, \dots, X_n]$
converges almost surely and in $L^1$ to $V_f$. In other words, CID sequences share with exchangeable
sequences the remarkable fact that the predictive mean and the empirical mean merge when
the number of observations diverges. Unfortunately, while, for an exchangeable sequence, we have
$V_f = E[f(X_1) \mid \mathcal{T}] = \int f(x)\, m(\omega, dx)$, where $\mathcal{T}$ is the tail $\sigma$-field and $m$ is the random directing
measure of the sequence, it is difficult to characterize explicitly the limit random variable $V_f$ for
a CID sequence. Indeed no representation theorems are available for CID sequences. See, e.g.,
Aletti, May and Secchi (2007).



The paper is organized as follows. In Section 2 we state our definition of generalized species
sampling sequence, we discuss some examples and we give a condition under which a generalized
species sampling sequence is CID with respect to a suitable filtration $\mathcal{G}$. In Sections 3 and 4 we
deal with a particular class of generalized species sampling sequences which are CID: the generalized
Ottawa sequences (GOS for short). We prove that, for a GOS, under suitable conditions, the sequence
$\sqrt{n}(M_n^f - V_n^f)$ converges in the sense of stable convergence to a mixture of Gaussian distributions.
Moreover, we show that, under suitable conditions, also $\sqrt{n}(V_n^f - V_f)$ converges in the sense of almost
sure conditional convergence to another mixture of Gaussian distributions. Both types of convergence
are stronger than convergence in distribution. These results are accompanied by two examples.
In Section 5 we study the length $L_n$ of the random partition induced by a GOS at time $n$, i.e. the
random number of the distinct values assumed by a GOS until time $n$. In particular, a strong law of
large numbers and a stable central limit theorem are presented. This section is also enriched by some
examples. The paper closes with a section devoted to proofs and with an appendix in which the reader
can find some results used for the proofs.



2 Prediction rules which generate a CID sequence 

The Blackwell-MacQueen urn scheme provides the most famous example of exchangeable prediction
rule, that is

$$P\{X_{n+1} \in \cdot \mid X_1, \dots, X_n\} = \sum_{i=1}^{n} \frac{1}{\theta + n}\,\delta_{X_i}(\cdot) + \frac{\theta}{\theta + n}\,\mu(\cdot)$$

where $\theta$ is a strictly positive parameter and $\mu$ is a probability measure, see, e.g., Blackwell and
MacQueen (1973) and Pitman (1996). This prediction rule determines an exchangeable sequence
$(X_n)_{n\ge 1}$ whose directing random measure is a Dirichlet process with parameter $\theta\mu(\cdot)$, see
Ferguson (1973). According to this prediction rule, if $\mu$ is diffuse, a new species is observed with
probability $\theta/(\theta + n)$ and an old species $X_j^*$ is observed with probability proportional to the
cardinality of $\pi_j^{(n)}$, a sort of preferential attachment principle. This rule has its analogue in terms of
random partitions in the so-called Chinese restaurant process, see Pitman (2006) and the references
therein.



A randomly reinforced prediction rule of the same kind could work as follows:

$$P\{X_{n+1} \in \cdot \mid X_1, \dots, X_n, Y_1, \dots, Y_n\} = \sum_{i=1}^{n} \frac{Y_i}{\theta + \sum_{j=1}^{n} Y_j}\,\delta_{X_i}(\cdot) + \frac{\theta}{\theta + \sum_{j=1}^{n} Y_j}\,\mu(\cdot) \qquad (4)$$

where $\mu$ is a probability measure and $(Y_n)_{n\ge 1}$ is a sequence of independent positive random variables.
If $\mu$ is diffuse, then we have the following interpretation: each individual has a random positive weight
$Y_i$ and, given the first $n$ tags $X(n) = (X_1, \dots, X_n)$ together with the weights $Y(n) = (Y_1, \dots, Y_n)$, it
is supposed that the species of the next individual is a new species with probability $\theta/(\theta + \sum_{j=1}^{n} Y_j)$
and one of the species observed so far, say $X_k^*$, with probability $\sum_{i \in \pi_k^{(n)}} Y_i/(\theta + \sum_{j=1}^{n} Y_j)$. Again a
preferential attachment principle. Note that, in this case, instead of describing the law of $(X_n)_{n\ge 1}$
with the sequence of the conditional distributions of $X_{n+1}$ given $X(n)$, we have a latent process
$(Y_n)_{n\ge 1}$ and we characterize $(X_n)_{n\ge 1}$ with the sequence of the conditional distributions of $X_{n+1}$
given $(X(n), Y(n))$.
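The sampling mechanism behind the reinforced rule (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_reinforced_sequence`, the callbacks `draw_weight` and `draw_from_mu`, and the choice of distributions in the usage lines are hypothetical and do not come from the paper.

```python
import random

def sample_reinforced_sequence(n, theta, draw_weight, draw_from_mu, rng):
    """Sample (X_1..X_n) and (Y_1..Y_n) following the reinforced rule (4).

    draw_from_mu(rng) returns one draw from mu; draw_weight(rng) returns a
    positive weight. Both callbacks are illustrative assumptions.
    """
    xs, ys = [], []
    for _ in range(n):
        total = theta + sum(ys)
        u = rng.random() * total
        if u < theta:
            # New tag drawn from mu, probability theta / (theta + sum of weights).
            x = draw_from_mu(rng)
        else:
            # Copy the tag of an earlier individual i, chosen with
            # probability proportional to its weight Y_i.
            u -= theta
            x = xs[-1]  # fallback guards against floating-point round-off
            for xi, yi in zip(xs, ys):
                if u < yi:
                    x = xi
                    break
                u -= yi
        xs.append(x)
        ys.append(draw_weight(rng))  # weight of the individual just observed
    return xs, ys

rng = random.Random(0)
xs, ys = sample_reinforced_sequence(
    50, 2.0, lambda g: 1.0 + g.random(), lambda g: g.random(), rng)
print(len(set(xs)), "distinct species among", len(xs), "individuals")
```

Taking `draw_weight` constantly equal to 1 gives back the Blackwell-MacQueen rule, since every earlier individual is then copied with probability 1/(θ + n).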

Now that we have given an idea, let us formalize what we mean by generalized species sampling
sequence. Let $(\Omega, \mathcal{A}, P)$ be a probability space and $E$ and $S$ be two Polish spaces, endowed with their
Borel $\sigma$-fields $\mathcal{E}$ and $\mathcal{S}$, respectively. In the sequel, $\mathcal{F}^Z = (\mathcal{F}^Z_n)_{n\ge 0}$ will stand for the natural filtration
associated with any sequence of random variables $(Z_n)_{n\ge 1}$ on $(\Omega, \mathcal{A}, P)$ and we set $\mathcal{F}^Z_\infty = \bigvee_{n\ge 0} \mathcal{F}^Z_n$.
Finally, $\mathcal{P}_n$ will denote the set of all partitions of $\{1, \dots, n\}$.

We shall say that a sequence $(X_n)_{n\ge 1}$ of random variables on $(\Omega, \mathcal{A}, P)$, with values in $E$, is a
generalized species sampling sequence if:

• ($h_1$) $X_1$ has distribution $\mu$.

• ($h_2$) There exists a sequence $(Y_n)_{n\ge 1}$ of random variables with values in $(S, \mathcal{S})$ such that, for
each $n \ge 1$, a version of the regular conditional distribution of $X_{n+1}$ given
$\mathcal{F}_n := \mathcal{F}^X_n \vee \mathcal{F}^Y_n$ is

$$K_{n+1}(\omega, \cdot) = \sum_{i=1}^{n} p_{n,i}\big(\pi^{(n)}(\omega), Y(n)(\omega)\big)\,\delta_{X_i(\omega)}(\cdot) + r_n\big(\pi^{(n)}(\omega), Y(n)(\omega)\big)\,\mu(\cdot) \qquad (5)$$

with $p_{n,i}(\cdot,\cdot)$ and $r_n(\cdot,\cdot)$ suitable measurable functions defined on $\mathcal{P}_n \times S^n$ with values in $[0, 1]$.

• ($h_3$) $X_{n+1}$ and $(Y_{n+j})_{j\ge 1}$ are conditionally independent given $\mathcal{F}_n$.

Example 2.1. Let $\mu$ be a probability measure on $E$, $(\nu_n)_{n\ge 1}$ be a sequence of probability measures
on $S$, $(r_n)_{n\ge 1}$ and $(p_{n,i})_{n\ge 1,\, 1\le i\le n}$ be measurable functions such that

$$r_n : \mathcal{P}_n \times S^n \to [0, 1], \qquad p_{n,i} : \mathcal{P}_n \times S^n \to [0, 1]$$

and

$$\sum_{i=1}^{n} p_{n,i}(q_n, y_1, \dots, y_n) + r_n(q_n, y_1, \dots, y_n) = 1 \qquad (6)$$

for each $n \ge 1$ and each $(q_n, y_1, \dots, y_n)$ in $\mathcal{P}_n \times S^n$. By the Ionescu Tulcea Theorem, there are
two sequences of random variables $(X_n)_{n\ge 1}$ and $(Y_n)_{n\ge 1}$, defined on a suitable probability space
$(\Omega, \mathcal{A}, P)$, taking values in $E$ and $S$ respectively, such that conditions ($h_1$), ($h_2$) and the following
condition are satisfied:

• $Y_{n+1}$ has distribution $\nu_{n+1}$ and it is independent of the $\sigma$-field
$\mathcal{F}_n \vee \sigma(X_{n+1}) = \mathcal{F}^X_{n+1} \vee \mathcal{F}^Y_n$.

This last condition implies that, for each $n$, $(Y_{n+j})_{j\ge 1}$ is independent of $\mathcal{F}^X_{n+1} \vee \mathcal{F}^Y_n$. It follows, in
particular, that $(Y_n)_{n\ge 1}$ is a sequence of independent random variables. Therefore, also ($h_3$) holds
true. Indeed, for each real-valued bounded $\mathcal{F}_n$-measurable random variable $V$, each bounded Borel
function $f$ on $E$, each $j \ge 1$ and each bounded Borel function $h$ on $S^j$, we have

$$E[V f(X_{n+1}) h(Y_{n+1}, \dots, Y_{n+j})] = E\big[V f(X_{n+1})\, E[h(Y_{n+1}, \dots, Y_{n+j}) \mid \mathcal{F}_n \vee \sigma(X_{n+1})]\big]$$

$$= E\Big[V f(X_{n+1}) \int h(y_{n+1}, \dots, y_{n+j})\, \nu_{n+1}(dy_{n+1}) \cdots \nu_{n+j}(dy_{n+j})\Big]$$

$$= E\Big[V\, E[f(X_{n+1}) \mid \mathcal{F}_n] \int h(y_{n+1}, \dots, y_{n+j})\, \nu_{n+1}(dy_{n+1}) \cdots \nu_{n+j}(dy_{n+j})\Big].$$

On the other hand, we have

$$E[h(Y_{n+1}, \dots, Y_{n+j}) \mid \mathcal{F}_n] = \int h(y_{n+1}, \dots, y_{n+j})\, \nu_{n+1}(dy_{n+1}) \cdots \nu_{n+j}(dy_{n+j})$$

hence

$$E[f(X_{n+1}) h(Y_{n+1}, \dots, Y_{n+j}) \mid \mathcal{F}_n] = E[f(X_{n+1}) \mid \mathcal{F}_n]\, E[h(Y_{n+1}, \dots, Y_{n+j}) \mid \mathcal{F}_n].$$

This fact is sufficient in order to conclude that also assumption ($h_3$) is verified.

In order to state our first result concerning generalized species sampling sequences, we need some
further notation. Set

$$p^*_{n,j}(\pi^{(n)}) = p^*_{n,j}(\pi^{(n)}, Y(n)) := \sum_{i \in \pi_j^{(n)}} p_{n,i}(\pi^{(n)}, Y(n)) \quad \text{for } j = 1, \dots, L_n$$

and

$$r_n := r_n(\pi^{(n)}, Y(n)).$$

Given a partition $\pi^{(n)}$, denote by $[\pi^{(n)}]_{j+}$ the partition of $\{1, \dots, n+1\}$ obtained by adding the element
$(n+1)$ to the $j$-th block of $\pi^{(n)}$. Finally, denote by $[\pi^{(n)}; (n+1)]$ the partition obtained by adding a
block containing $(n+1)$ to $\pi^{(n)}$. For instance, if $\pi^{(3)} = [(1, 3); (2)]$, then $[\pi^{(3)}]_{2+} = [(1, 3); (2, 4)]$ and
$[\pi^{(3)}; (4)] = [(1, 3); (2); (4)]$.
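The two partition operations just introduced are easy to mirror in code. The following Python sketch represents a partition as a list of tuples in order of appearance; the function names `add_to_block` and `add_new_block` are hypothetical and serve only to reproduce the example with the partition [(1, 3); (2)].

```python
def add_to_block(partition, j):
    """Return the partition in which element n+1 joins the j-th block (1-based)."""
    n = sum(len(block) for block in partition)
    return [block + (n + 1,) if k == j else block
            for k, block in enumerate(partition, start=1)]

def add_new_block(partition):
    """Return the partition in which element n+1 forms a new singleton block."""
    n = sum(len(block) for block in partition)
    return list(partition) + [(n + 1,)]

pi3 = [(1, 3), (2,)]
print(add_to_block(pi3, 2))   # [(1, 3), (2, 4)]
print(add_new_block(pi3))     # [(1, 3), (2,), (4,)]
```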

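Theorem 2.2. A generalized species sampling sequence $(X_n)_{n\ge 1}$ with $\mu$ diffuse is a CID sequence
with respect to the filtration $\mathcal{G} = (\mathcal{G}_n)_{n\ge 0}$ with $\mathcal{G}_n := \mathcal{F}^X_n \vee \mathcal{F}^Y_\infty$ if and only if, for each $n$, the following
condition holds $P$-almost surely:

$$p^*_{n,j}(\pi^{(n)}) = r_n\, p^*_{n+1,j}\big([\pi^{(n)}; \{n+1\}]\big) + \sum_{l=1}^{L_n} p^*_{n+1,j}\big([\pi^{(n)}]_{l+}\big)\, p^*_{n,l}(\pi^{(n)})$$

for $1 \le j \le L_n$.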

The next example generalizes the well-known two parameter Poisson-Dirichlet process.

Example 2.3. Let $\theta > 0$ and $\alpha \ge 0$. Moreover, let $\mu$ be a probability measure on $E$ and $(\nu_n)_{n\ge 1}$
be a sequence of probability measures on $(\alpha, +\infty)$. Consider the following sequence of functions

$$p_{n,i}(q_n, y(n)) := \frac{y_i - \alpha/C_i(q_n)}{\theta + \sum_{j=1}^{n} y_j}, \qquad r_n(q_n, y(n)) := \frac{\theta + \alpha L(q_n)}{\theta + \sum_{j=1}^{n} y_j} \qquad (7)$$

where $y(n) = (y_1, \dots, y_n) \in (\alpha, +\infty)^n$, $q_n \in \mathcal{P}_n$, $C_i(q_n)$ is the cardinality of the block in $q_n$ which
contains $i$ and $L(q_n)$ is the number of blocks of $q_n$. It is easy to see that such functions satisfy (6).
Hence, by Example 2.1 there exists a generalized species sampling sequence $(X_n)_{n\ge 1}$ for which

$$P\{X_{n+1} \in \cdot \mid X(n), Y(n)\} = \sum_{l=1}^{L_n} \frac{\sum_{i \in \pi_l^{(n)}} Y_i - \alpha}{\theta + \sum_{j=1}^{n} Y_j}\,\delta_{X_l^*}(\cdot) + \frac{\theta + \alpha L_n}{\theta + \sum_{j=1}^{n} Y_j}\,\mu(\cdot) \qquad (8)$$

where $(Y_n)_{n\ge 1}$ is a sequence of independent random variables such that each $Y_n$ has law $\nu_n$. If $\mu$ is
diffuse, one can easily check that the condition of Theorem 2.2 holds and so $(X_n)_{n\ge 1}$ is a CID sequence with
respect to $\mathcal{G} = (\mathcal{F}^X_n \vee \mathcal{F}^Y_\infty)_{n\ge 0}$.
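The normalization (6) for the weights of Example 2.3 can be checked with exact arithmetic. The Python sketch below assumes the weights read $p_{n,i} = (y_i - \alpha/C_i(q_n))/(\theta + \sum_j y_j)$ and $r_n = (\theta + \alpha L(q_n))/(\theta + \sum_j y_j)$; the function name `example_23_weights` and the numerical values are hypothetical.

```python
from fractions import Fraction

def example_23_weights(partition, ys, theta, alpha):
    """Weights p_{n,i} (i = 1..n) and r_n for a partition of {1..n}."""
    n = len(ys)
    # C_i(q_n): size of the block containing index i
    block_size = {i: len(block) for block in partition for i in block}
    denom = theta + sum(ys)
    p = [(ys[i - 1] - Fraction(alpha) / block_size[i]) / denom
         for i in range(1, n + 1)]
    r = (theta + alpha * len(partition)) / denom
    return p, r

theta, alpha = Fraction(2), Fraction(1, 2)
ys = [Fraction(1), Fraction(3, 2), Fraction(2)]   # every weight exceeds alpha
p, r = example_23_weights([(1, 3), (2,)], ys, theta, alpha)
print(sum(p) + r)   # the weights add up to one, as required by (6)
```

The check works because the terms $\alpha/C_i(q_n)$ summed over a block add up to $\alpha$, so the total mass removed from the old species is exactly $\alpha L(q_n)$.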

It is worthwhile noting that if $Y_n = 1$ for every $n \ge 1$ and $\alpha$ belongs to $[0, 1]$, then we get an
exchangeable sequence directed by the well-known two parameter Poisson-Dirichlet process: i.e. an
exchangeable sequence described by the prediction rule

$$P\{X_{n+1} \in \cdot \mid X_1, \dots, X_n\} = \sum_{l=1}^{L_n} \frac{\mathrm{card}(\pi_l^{(n)}) - \alpha}{\theta + n}\,\delta_{X_l^*}(\cdot) + \frac{\theta + \alpha L_n}{\theta + n}\,\mu(\cdot).$$

See, e.g., Pitman and Yor (1993) and Pitman (2009).

A special case of the previous example is the randomly reinforced Blackwell-McQueen urn scheme
(4). However this prediction rule may be placed in a more general class of generalized species
sampling sequences that are CID. In the next sections, we shall introduce and study this class, called
"Generalized Ottawa Sequences".



3 Generalized Ottawa sequences 

We shall say that a generalized species sampling sequence $(X_n)_{n\ge 1}$ is a generalized Ottawa sequence
or, more briefly, a GOS, if for every $n \ge 1$:

• The functions $r_n$ and $p_{n,i}$ ($i = 1, \dots, n$) do not depend on the partition, hence

$$K_{n+1}(\omega, \cdot) = \sum_{i=1}^{n} p_{n,i}(Y(n)(\omega))\,\delta_{X_i(\omega)}(\cdot) + r_n(Y(n)(\omega))\,\mu(\cdot). \qquad (9)$$

• The functions $r_n$ are strictly positive and

$$r_n(Y_1, \dots, Y_n) \ge r_{n+1}(Y_1, \dots, Y_n, Y_{n+1}) \qquad (10)$$

almost surely.

• The functions $p_{n,i}$ satisfy

$$p_{n,i} := \frac{r_n}{r_{n-1}}\, p_{n-1,i} \quad \text{for } i = 1, \dots, n-1, \qquad p_{n,n} := 1 - \frac{r_n}{r_{n-1}} \qquad (11)$$

with $r_0 = 1$.

For simplicity, from now on, we shall denote by $r_n$ and $p_{n,i}$ the $\mathcal{F}^Y_n$-measurable random variables
$r_n(Y(n))$ and $p_{n,i}(Y(n))$, that is $r_n := r_n(Y(n))$ and $p_{n,i} := p_{n,i}(Y(n))$.

First of all let us stress that any GOS is a CID sequence with respect to the filtration
$\mathcal{G} = (\mathcal{F}^X_n \vee \mathcal{F}^Y_\infty)_{n\ge 0}$. Indeed, since $\mathcal{G}_n = \mathcal{F}_n \vee \sigma(Y_{n+j} : j \ge 1)$, condition ($h_3$) implies that

$$E[f(X_{n+1}) \mid \mathcal{G}_n] = E[f(X_{n+1}) \mid \mathcal{F}_n] \qquad (12)$$

for each bounded Borel function $f$ on $E$ and hence, by ($h_2$), one gets

$$V_n^f := E[f(X_{n+1}) \mid \mathcal{G}_n] = \sum_{i=1}^{n} p_{n,i} f(X_i) + r_n E[f(X_1)].$$

Since the random variables $p_{n+1,i}$ are $\mathcal{G}_n$-measurable it follows that

$$E[V_{n+1}^f \mid \mathcal{G}_n] = \sum_{i=1}^{n} p_{n+1,i} f(X_i) + p_{n+1,n+1} E[f(X_{n+1}) \mid \mathcal{G}_n] + r_{n+1} E[f(X_1)]$$

$$= \frac{r_{n+1}}{r_n} \sum_{i=1}^{n} p_{n,i} f(X_i) + V_n^f - \frac{r_{n+1}}{r_n} V_n^f + r_{n+1} E[f(X_1)]$$

$$= \frac{r_{n+1}}{r_n} V_n^f - r_{n+1} E[f(X_1)] + V_n^f - \frac{r_{n+1}}{r_n} V_n^f + r_{n+1} E[f(X_1)] = V_n^f.$$
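To make the recursion (11) concrete, the following Python sketch builds the weights $p_{n,i}$ from a given sequence $r_1, r_2, \dots$ (with $r_0 = 1$) and checks that each row of weights plus $r_n$ adds up to one. It assumes (11) reads $p_{n,i} = (r_n/r_{n-1})\,p_{n-1,i}$ for $i < n$ and $p_{n,n} = 1 - r_n/r_{n-1}$; the function name `gos_weights` and the numbers are hypothetical.

```python
from fractions import Fraction

def gos_weights(r):
    """Given r = [r_1, ..., r_n] (with r_0 = 1), return the rows
    [p_{m,1}, ..., p_{m,m}] for m = 1..n obtained from recursion (11)."""
    rows, prev_r, prev_row = [], Fraction(1), []
    for r_m in r:
        ratio = r_m / prev_r
        # old weights are rescaled, the newest individual gets 1 - r_m / r_{m-1}
        row = [ratio * p for p in prev_row] + [1 - ratio]
        rows.append(row)
        prev_r, prev_row = r_m, row
    return rows

# r_m = theta / (theta + y_1 + ... + y_m) with theta = 1 and y = (1, 2, 3)
r = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 7)]
for r_m, row in zip(r, gos_weights(r)):
    print(row, sum(row) + r_m)
```

With this particular choice of $r_m$ the recursion returns $p_{m,i} = y_i/(\theta + y_1 + \dots + y_m)$, which is the weight structure of the reinforced rule (4).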

Some examples follow. 



Example 3.1. Consider a GOS for which

$$Y_n = a_n$$

where $(a_n)_{n\ge 0}$ is a decreasing numerical sequence with $a_0 = 1$, $a_n > 0$ and $r_n(y_1, \dots, y_n) = y_n$.

Example 3.2. Let $(Y_n)_{n\ge 1}$ be a Markov chain taking values in $(0, 1]$, with $Y_1 = 1$ and transition
probability kernel given by

$$P\{Y_{n+1} \le x \mid Y_n\} = \frac{x}{Y_n}\, I_{(0, Y_n)}(x) + I_{[Y_n, +\infty)}(x), \qquad n \ge 1.$$

Then we have $Y_n \ge Y_{n+1}$ a.s. for all $n \ge 1$. Thus we can consider a GOS with $r_n(y_1, \dots, y_n) = y_n$.
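The chain of Example 3.2 is simple to simulate. The sketch below assumes the transition kernel is the uniform law on $(0, Y_n)$, which is consistent with the value $E[Y_n] = (1/2)^{n-1}$ quoted later in Remark 5.3; the function name `sample_uniform_chain` is hypothetical.

```python
import random

def sample_uniform_chain(n, rng):
    """Return (Y_1, ..., Y_n) with Y_1 = 1 and Y_{k+1} uniform on (0, Y_k)."""
    ys = [1.0]
    while len(ys) < n:
        # 1 - rng.random() lies in (0, 1], so the chain stays strictly positive
        ys.append(ys[-1] * (1.0 - rng.random()))
    return ys

ys = sample_uniform_chain(10, random.Random(0))
print(all(a >= b for a, b in zip(ys, ys[1:])))   # the path is non-increasing
```

Since the path is non-increasing and strictly positive, the choice $r_n = Y_n$ satisfies the monotonicity requirement (10).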

As we shall see in the next example, the randomly reinforced Blackwell-McQueen urn scheme
gives rise to a GOS.

Example 3.3. Let $\mu$ be a probability measure on $E$, $(\nu_n)_{n\ge 1}$ be a sequence of probability measures
on $S$ and $(r_n)$, $(p_{n,i})$ measurable functions as in (10) and (11). Following Example 2.1 there exist
two sequences of random variables $(X_n)_{n\ge 1}$ and $(Y_n)_{n\ge 1}$, defined on a suitable probability space
$(\Omega, \mathcal{A}, P)$, such that each $Y_n$ has law $\nu_n$ and it is independent of $\mathcal{F}^X_n \vee \mathcal{F}^Y_{n-1}$ and $(X_n)_{n\ge 1}$ follows the
prediction rule (9), i.e. it is a GOS.

As a special case one can consider $S = \mathbb{R}_+$ and

$$r_n(y_1, \dots, y_n) = \frac{\theta}{\theta + \sum_{j=1}^{n} y_j}$$

with $\theta > 0$.

A particular case of the previous example is the following randomly reinforced Polya urn.

Example 3.4 (A randomly reinforced Polya urn). An urn contains $b$ black and $r$ red balls, $b$ and
$r$ being strictly positive integer numbers. Repeatedly (at each time $n \ge 1$), one ball is drawn at
random from the urn and then replaced together with a positive random number $Y_n$ of additional
balls of the same color. For each $n$, the random number $Y_n$ must be independent of the preceding
numbers and of the drawings until time $n$. If we denote by $X_n$ the indicator function of the event
{black ball at time $n$}, then we clearly have $E = \{0, 1\}$,

$$\mu(\{1\}) = \frac{b}{b + r}, \qquad \mu(\{0\}) = \frac{r}{b + r}$$

and

$$P\{X_{n+1} \in \cdot \mid X(n), Y(n)\} = \sum_{i=1}^{n} \frac{Y_i}{b + r + \sum_{j=1}^{n} Y_j}\,\delta_{X_i}(\cdot) + \frac{b + r}{b + r + \sum_{j=1}^{n} Y_j}\,\mu(\cdot).$$

Note that the sequence $(X_n)_{n\ge 1}$ is generally not exchangeable. Indeed, it is straightforward to
prove that, even if the random variables $Y_n$ are identically distributed, the sequence $(X_n)_{n\ge 1}$ is not
exchangeable (apart from particular cases).
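The failure of exchangeability can be seen on a small case by exact enumeration. The Python sketch below computes the law of $(X_1, X_2, X_3)$ for an urn with $b = r = 1$ when $Y_1, Y_2$ are independent and uniform on $\{1, 3\}$; these numerical choices and the function names `path_probability` and `law_of_first_three` are hypothetical illustrations, not taken from the paper.

```python
from fractions import Fraction
from itertools import product

def path_probability(colors, weights, b, r):
    """Probability of a colour path given the reinforcements Y_1, Y_2, ...
    colors[k] is 1 for black and 0 for red; weights[k] balls of the drawn
    colour are added after draw k."""
    black, red, prob = Fraction(b), Fraction(r), Fraction(1)
    for k, color in enumerate(colors):
        total = black + red
        prob *= (black if color == 1 else red) / total
        if k < len(weights):
            if color == 1:
                black += weights[k]
            else:
                red += weights[k]
    return prob

def law_of_first_three(b, r, support):
    """Law of (X_1, X_2, X_3) when Y_1, Y_2 are i.i.d. uniform on `support`."""
    law = {}
    for colors in product((0, 1), repeat=3):
        total = sum(path_probability(colors, (y1, y2), b, r)
                    for y1, y2 in product(support, repeat=2))
        law[colors] = total / len(support) ** 2
    return law

law = law_of_first_three(b=1, r=1, support=(1, 3))
print(law[(1, 1, 0)], law[(1, 0, 1)], law[(0, 1, 1)])
```

For an exchangeable sequence the three printed probabilities would coincide, since the paths are permutations of each other. Here the first two agree but the third differs, because the weight attached to the first draw influences more of the later draws than the weight attached to the second one.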



4 Convergence results for a GOS 



In this section we prove some limit theorems for a GOS under stable convergence and almost sure
conditional convergence.

Stable convergence has been introduced by Rényi (1963) and subsequently studied by various
authors, see, for example, Aldous and Eagleson (1978), Jacod and Memin (1981), Hall and Heyde
(1980). A detailed treatment, including some strengthened forms of stable convergence, can be found
in Crimaldi, Letta and Pratelli (2007).

Given a probability space $(\Omega, \mathcal{A}, P)$ and a Polish space $E$ (endowed with its Borel $\sigma$-field $\mathcal{E}$), a
kernel $K$ on $E$ is a family $K = (K(\omega, \cdot))_{\omega \in \Omega}$ of probability measures on $E$ such that, for each bounded
Borel function $g$ on $E$, the function

$$K(g)(\omega) = \int g(x)\, K(\omega, dx)$$

is measurable with respect to $\mathcal{A}$. Given a sub-$\sigma$-field $\mathcal{H}$ of $\mathcal{A}$, we say that the kernel $K$ is $\mathcal{H}$-measurable
if, for each bounded Borel function $g$ on $E$, the random variable $K(g)$ is measurable with respect to
$\mathcal{H}$. In the following, the symbol $\mathcal{N}$ will denote the sub-$\sigma$-field generated by the $P$-negligible events
of $\mathcal{A}$. Given a sub-$\sigma$-field $\mathcal{H}$ of $\mathcal{A}$ and a $\mathcal{H} \vee \mathcal{N}$-measurable kernel $K$ on $E$, a sequence $(Z_n)_{n\ge 1}$
of random variables on $(\Omega, \mathcal{A}, P)$ with values in $E$ converges $\mathcal{H}$-stably to $K$ if, for each bounded
continuous function $g$ on $E$ and for each $\mathcal{H}$-measurable real-valued bounded random variable $W$,

$$E[g(Z_n) W] \longrightarrow E[K(g) W].$$

If $(Z_n)_{n\ge 1}$ converges $\mathcal{H}$-stably to $K$ then, for each $A \in \mathcal{H}$ with $P(A) \ne 0$, the sequence $(Z_n)_{n\ge 1}$
converges in distribution under the probability measure $P_A = P(\cdot \mid A)$ to the probability measure
$P_A K$ on $E$ given by

$$P_A K(B) = P(A)^{-1} E[I_A K(\cdot, B)] = \int K(\omega, B)\, P_A(d\omega) \quad \text{for each } B \in \mathcal{E}. \qquad (13)$$

In particular, if $(Z_n)_{n\ge 1}$ converges $\mathcal{H}$-stably to $K$, then $(Z_n)_{n\ge 1}$ converges in distribution to the
probability measure $PK$ on $E$ given by

$$PK(B) = E[K(\cdot, B)] = \int K(\omega, B)\, P(d\omega) \quad \text{for each } B \in \mathcal{E}. \qquad (14)$$

Moreover, if all the random variables $Z_n$ are $\mathcal{H}$-measurable, then the $\mathcal{H}$-stable convergence obviously
implies the $\mathcal{A}$-stable convergence.

Given a filtration $\mathcal{G} = (\mathcal{G}_n)_{n\ge 0}$ and a kernel $K$ on $E$, we shall say that, with respect to $\mathcal{G}$, the
sequence $(Z_n)_{n\ge 1}$ converges to $K$ in the sense of the almost sure conditional convergence if, for each
bounded continuous function $g$, we have

$$E[g(Z_n) \mid \mathcal{G}_n] \longrightarrow K(g) \quad \text{almost surely.}$$

If $(Z_n)_{n\ge 1}$ converges to $K$ in the sense of the almost sure conditional convergence with respect to a
filtration $\mathcal{G}$, then $(Z_n)_{n\ge 1}$ also converges $\mathcal{G}_\infty$-stably to $K$, see Crimaldi (2007).

Throughout the paper, if $U$ is a positive random variable, we shall call the Gaussian kernel
associated with $U$ the family

$$\mathcal{N}(0, U) = \big(\mathcal{N}(0, U(\omega))\big)_{\omega \in \Omega}$$

of Gaussian distributions with zero mean and variance equal to $U(\omega)$ (with $\mathcal{N}(0, 0) := \delta_0$). Note that,
in this case, the probability measure defined in (13) and (14) is a mixture of Gaussian distributions.

It is worthwhile to recall that, if $(X_n)_{n\ge 1}$ is a GOS, then it is a CID sequence with respect to the
filtration $\mathcal{G} = (\mathcal{F}^X_n \vee \mathcal{F}^Y_\infty)_{n\ge 0}$ (as shown in Section 3) and so the sequence $V_n^f$ (defined in Section 1)
converges almost surely and in $L^1$ to a random variable $V_f$, whenever $f$ is a bounded Borel function
on $E$. Moreover, the random variable $V_f$ is also the almost sure (and in $L^1$) limit of the empirical
mean

$$M_n^f = \frac{1}{n}\sum_{k=1}^{n} f(X_k).$$

We are ready to state the main theorems of this section. 

Theorem 4.1. Let $(X_n)_{n\ge 1}$ be a GOS. Using the above notation, for each bounded Borel function $f$
and each $n \ge 1$, let us set

$$S_n^f := \sqrt{n}\,(M_n^f - V_n^f)$$

and, for $1 \le j \le n$,

$$Z_{n,j}^f := \frac{1}{\sqrt{n}}\big[f(X_j) - j V_j^f + (j-1) V_{j-1}^f\big] = \frac{1}{\sqrt{n}}(1 - j\,p_{j,j})\big[f(X_j) - V_{j-1}^f\big].$$

Suppose that:

(a) $U_n^f := \sum_{j=1}^{n} (Z_{n,j}^f)^2 \xrightarrow{P} U_f$.

(b) $(Z_n^f)^* := \sup_{1\le j\le n} |Z_{n,j}^f| \xrightarrow{L^1} 0$.

Then the sequence $(S_n^f)_{n\ge 1}$ converges $\mathcal{A}$-stably to the Gaussian kernel $\mathcal{N}(0, U_f)$.
In particular, conditions (a) and (b) are satisfied if the following conditions hold:

(a1) $U_n^f \xrightarrow{a.s.} U_f$.

(b1) $\sup_n E[(S_n^f)^2] < +\infty$.
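The two algebraic facts behind the variables $Z^f_{n,j}$, namely the identity $f(X_j) - jV_j^f + (j-1)V_{j-1}^f = (1 - j\,p_{j,j})\,[f(X_j) - V_{j-1}^f]$ and the telescoping sum $\sum_{j} Z^f_{n,j} = S_n^f$, can be verified exactly on any numerical path. The Python sketch below does this under the assumption $r_n = \theta/(\theta + Y_1 + \dots + Y_n)$, so that $p_{j,j} = Y_j/(\theta + Y_1 + \dots + Y_j)$; the function name `increments` and the sample values are hypothetical.

```python
from fractions import Fraction

def increments(fx, ys, theta, mean_f):
    """Return the unnormalised increments f(X_j) - j V_j + (j-1) V_{j-1}
    and the last predictive mean V_n, where
    V_j = (theta * E[f(X_1)] + sum_i Y_i f(X_i)) / (theta + sum_i Y_i)."""
    theta, mean_f = Fraction(theta), Fraction(mean_f)
    v_prev, num, den, out = mean_f, theta * mean_f, theta, []
    for j, (f_j, y_j) in enumerate(zip(fx, ys), start=1):
        f_j, y_j = Fraction(f_j), Fraction(y_j)
        num, den = num + y_j * f_j, den + y_j
        v_j = num / den
        p_jj = y_j / den
        lhs = f_j - j * v_j + (j - 1) * v_prev
        # the two expressions for the increment agree exactly
        assert lhs == (1 - j * p_jj) * (f_j - v_prev)
        out.append(lhs)
        v_prev = v_j
    return out, v_prev

fx, ys = [1, 0, 1, 1], [1, 3, 2, 1]
out, v_n = increments(fx, ys, theta=2, mean_f=Fraction(1, 2))
m_n = Fraction(sum(fx), len(fx))
print(sum(out) == len(fx) * (m_n - v_n))   # telescoping sum
```

Dividing each increment by $\sqrt{n}$ gives $Z^f_{n,j}$, so the printed check is the statement $\sum_j Z^f_{n,j} = S_n^f$ up to that common factor.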

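Let us see an application of the previous theorem in the next example.

Example 4.2. Let us consider the setting of Example 3.3 with

$$r_n = \frac{\theta}{\theta + \sum_{i=1}^{n} Y_i}$$

where $\theta > 0$ and the random variables $Y_n$ are identically distributed with $Y_n \ge \gamma > 0$ and $E[Y_n^4] < +\infty$.
Given a bounded Borel function $f$ on $E$ we are going to prove that the sequence $(S_n^f)_{n\ge 1}$
(defined in Theorem 4.1) converges $\mathcal{A}$-stably to the Gaussian kernel

$$\mathcal{N}\big(0, \Delta (V_{f^2} - V_f^2)\big),$$

where $\Delta := \mathrm{Var}[Y_1]/E[Y_1]^2$.

Without loss of generality, we may assume that $f$ takes values in $[0, 1]$. Let us observe that, after
some calculations, we have

$$S_n^f = \frac{\sqrt{n}}{\theta + \sum_{i=1}^{n} Y_i}\Big[\theta\big(M_n^f - E[f(X_1)]\big) + \sum_{i=1}^{n} Y_i\big(M_n^f - f(X_i)\big)\Big].$$

If we set $b := \theta E[f(X_1)]$ and $\tilde{Y}_i := Y_i - E[Y_i] = Y_i - m$, then we can write

$$S_n^f = \frac{\sqrt{n}}{\theta + \sum_{i=1}^{n} Y_i}\Big[M_n^f\Big(\theta + \sum_{i=1}^{n} \tilde{Y}_i\Big) - \Big(b + \sum_{i=1}^{n} \tilde{Y}_i f(X_i)\Big)\Big].$$

Therefore, since $Y_i \ge \gamma$ and $0 \le f(X_n) \le 1$ for each $n$, we obtain

$$E[(S_n^f)^2] \le \frac{2n}{(\theta + \gamma n)^2}\Big(E\Big[\Big(\theta + \sum_{i=1}^{n} \tilde{Y}_i\Big)^2\Big] + E\Big[\Big(b + \sum_{i=1}^{n} \tilde{Y}_i f(X_i)\Big)^2\Big]\Big)$$

$$\le \frac{2n}{(\theta + \gamma n)^2}\big(\theta^2 + b^2 + 2n\,\mathrm{Var}[Y_1]\big) \le C$$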

where C is a suitable constant. Finally, let us observe that, after some calculations, we get 

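$$U_n^f = \frac{1}{n}\sum_{j=1}^{n}\big[f(X_j) - j V_j^f + (j-1) V_{j-1}^f\big]^2 = \frac{1}{n}\sum_{j=1}^{n}\big[f(X_j) + A_j Y_j - B_j Y_j f(X_j) - V_{j-1}^f\big]^2,$$

where

$$A_j = j\Big[\Big(\theta + \sum_{i=1}^{j-1} Y_i\Big)\Big(\theta + \sum_{i=1}^{j} Y_i\Big)\Big]^{-1}\Big[b - \theta f(X_j) + \sum_{i=1}^{j-1} Y_i f(X_i)\Big],$$

$$B_j = j\Big[\Big(\theta + \sum_{i=1}^{j-1} Y_i\Big)\Big(\theta + \sum_{i=1}^{j} Y_i\Big)\Big]^{-1}\sum_{i=1}^{j-1} Y_i.$$

Hence, we have

$$U_n^f = \frac{1}{n}\sum_{j=1}^{n}\big[f^2(X_j) + A_j^2 Y_j^2 + B_j^2 Y_j^2 f^2(X_j) + (V_{j-1}^f)^2 - 2 f(X_j) V_{j-1}^f\big]$$

$$+ \frac{2}{n}\sum_{j=1}^{n}\big[A_j Y_j f(X_j) - B_j Y_j f^2(X_j) - A_j Y_j V_{j-1}^f - A_j B_j Y_j^2 f(X_j) + B_j Y_j f(X_j) V_{j-1}^f\big].$$

Recall that we have the following almost sure convergences:

$$f^q(X_n)/n \to 0 \ \ (\text{for } q = 1, 2), \qquad V_n^f \to V_f,$$

$$\frac{1}{n}\sum_{j=1}^{n} Y_j^t \to E[Y_1^t] \ \ (\text{for } t = 1, 2), \qquad \frac{1}{n}\sum_{j=1}^{n} f(X_j) \to V_f, \qquad \frac{1}{n}\sum_{j=1}^{n} f^2(X_j) \to V_{f^2}.$$

From the above relations, we get

$$B_j \xrightarrow{a.s.} 1/E[Y_1].$$

In order to study the convergence of $\frac{1}{n}\sum_{j=1}^{n} Y_j^t f^q(X_j)$ for $t, q = 1, 2$, let us set

$$Z_n := \sum_{j=1}^{n} \frac{1}{j}\Big(Y_j^t f^q(X_j) - E[Y_j^t f^q(X_j) \mid \mathcal{F}_{j-1}]\Big).$$

The sequence $(Z_n)_{n\ge 1}$ is a martingale with respect to $\mathcal{F} = (\mathcal{F}_n)_{n\ge 1}$ such that

$$E[Z_n^2] = \sum_{j=1}^{n} \frac{1}{j^2}\, E\Big[\big(Y_j^t f^q(X_j) - E[Y_j^t f^q(X_j) \mid \mathcal{F}_{j-1}]\big)^2\Big] \le 2 E[Y_1^{2t}] \sum_{j\ge 1} \frac{1}{j^2} < +\infty.$$

Therefore, by Kronecker's lemma, we find that

$$\frac{1}{n}\sum_{j=1}^{n}\Big(Y_j^t f^q(X_j) - E[Y_j^t f^q(X_j) \mid \mathcal{F}_{j-1}]\Big) \xrightarrow{a.s.} 0.$$

On the other hand, since $Y_j$ is independent of $\mathcal{F}^X_j \vee \mathcal{F}^Y_{j-1}$ by assumption, we have

$$E[Y_j^t f^q(X_j) \mid \mathcal{F}_{j-1}] = E[Y_1^t]\, E[f^q(X_j) \mid \mathcal{F}_{j-1}] = E[Y_1^t]\, V_{j-1}^{f^q} \xrightarrow{a.s.} E[Y_1^t]\, V_{f^q}.$$

Since $n^{-1}\sum_{j=1}^{n} a_j d_j \xrightarrow{a.s.} a d$ whenever

$$a_j \ge 0, \qquad d_j \xrightarrow{a.s.} d, \qquad n^{-1}\sum_{j=1}^{n} a_j \xrightarrow{a.s.} a, \qquad (15)$$

we obtain that

$$\frac{1}{n}\sum_{j=1}^{n} E[Y_j^t f^q(X_j) \mid \mathcal{F}_{j-1}] \xrightarrow{a.s.} E[Y_1^t]\, V_{f^q}$$

and so

$$\frac{1}{n}\sum_{j=1}^{n} Y_j^t f^q(X_j) \xrightarrow{a.s.} E[Y_1^t]\, V_{f^q}.$$

In particular, we get

$$A_j \xrightarrow{a.s.} V_f / E[Y_1].$$

Summing up, we have proved that $U_n^f$ is a sum of terms of the type $n^{-1}\sum_{j=1}^{n} a_j d_j$, where $(a_j)$ and
$(d_j)$ satisfy conditions (15), and so we finally get that $U_n^f$ converges a.s. to $U_f = \Delta (V_{f^2} - V_f^2)$. By
Theorem 4.1 we conclude that $(S_n^f)_{n\ge 1}$ converges $\mathcal{A}$-stably to the Gaussian kernel $\mathcal{N}(0, \Delta (V_{f^2} - V_f^2))$.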



The second result of this section is contained in the following theorem.

Theorem 4.3. Let $(X_n)_{n\ge 1}$ be a GOS and $f$ be a bounded Borel function. Using the previous
notation, for $n \ge 0$ set

$$Q_n := p_{n+1,n+1} = 1 - \frac{r_{n+1}}{r_n} \qquad \text{and} \qquad W_n^f := \sqrt{n}\,(V_n^f - V_f).$$

Suppose that the following conditions are satisfied:

(i) $n \sum_{k\ge n} Q_k^2 \xrightarrow{a.s.} H$, where $H$ is a positive real random variable.

(ii) $\sum_{k\ge 0} k^2 E[Q_k^4] < \infty$.

Then the sequence $(W_n^f)_{n\ge 0}$ converges to the Gaussian kernel

$$\mathcal{N}\big(0, H (V_{f^2} - V_f^2)\big)$$

in the sense of the almost sure conditional convergence with respect to the filtrations
$\mathcal{F} = (\mathcal{F}^X_n \vee \mathcal{F}^Y_n)_{n\ge 0}$ and $\mathcal{G} = (\mathcal{F}^X_n \vee \mathcal{F}^Y_\infty)_{n\ge 0}$.

In particular, we have

$$W_n^f \xrightarrow{\mathcal{A}\text{-stably}} \mathcal{N}\big(0, H (V_{f^2} - V_f^2)\big).$$
Corollary 4.4. Using the notation of Theorem 4.3, let us set for $k \ge 0$

$$\rho_k := \frac{1}{r_{k+1}} - \frac{1}{r_k}$$

and assume the following conditions:

(a) $r_k \le c_k$ a.s. with $\sum_{k\ge 0} k^2 c_{k+1}^4 < \infty$ and $k r_k \xrightarrow{a.s.} \alpha$, where $c_k$, $\alpha$ are strictly positive constants.

(b) The random variables $\rho_k$ are independent and identically distributed with $E[\rho_k^4] < \infty$.

Finally, let us set $\beta := E[\rho_k^2]$ and $h := \alpha^2 \beta$.

Then, the conclusion of Theorem 4.3 holds true with $H$ equal to the constant $h$.
Example 4.5. Let us consider the setting of Example 3.3 with

$$r_k = \frac{\theta}{\theta + \sum_{i=1}^{k} Y_i}$$

where $\theta > 0$ and the random variables $Y_n$ are identically distributed with $Y_n \ge \gamma > 0$ and $E[Y_n^4] < +\infty$.
Let us set $E[Y_1] = m$ and $E[Y_1^2] = \delta$. We have $r_k \le c_k := \theta/(\theta + \gamma k)$ and, by the strong law of
large numbers, we have

$$k r_k = \frac{\theta}{\theta/k + \frac{1}{k}\sum_{i=1}^{k} Y_i} \xrightarrow{a.s.} \theta/m.$$

Furthermore we have

$$\rho_k = \frac{1}{r_{k+1}} - \frac{1}{r_k} = \frac{Y_{k+1}}{\theta}$$

and so $\beta = E[\rho_k^2] = \delta/\theta^2$. Therefore the above corollary holds with $h = \delta/m^2$.

The particular generalized Polya urn discussed in Crimaldi (2003) (Cor. 4.1) and in May, Paganoni
and Secchi (2005) is included in the above example.
5 Random partition induced by a GOS

Exchangeable species sampling sequences are strictly connected with exchangeable random partitions.
Random partitions have been studied extensively, see, for instance, Pitman (2006) and the references
therein.

In this section we investigate some properties of the length $L_n$ of the random partition induced
by a GOS at time $n$, i.e. the random number of distinct values of a GOS until time $n$.

Let $A_0 := E$ and $A_n(\omega) := E \setminus \{X_1(\omega), \dots, X_n(\omega)\} = \{y \in E : y \notin \{X_1(\omega), \dots, X_n(\omega)\}\}$ for
$n \ge 1$ and define the following $\mathcal{F}_n$-measurable random variable:

$$s_n := r_n(Y(n))\,\mu(A_n) = r_n\,\mu(A_n).$$

Remark 5.1. Reconsidering the species interpretation, given $X(n) = (X_1, \dots, X_n)$ and $Y(n) =
(Y_1, \dots, Y_n)$, the species of the $(n+1)$-th individual is a new species with probability $s_n$ and one of
the species observed so far with probability $1 - s_n$. In particular one has

$$P[L_{n+1} = L_n + 1 \mid \mathcal{F}_n] = s_n = r_n\,\mu(A_n).$$

If the probability measure $\mu$ is diffuse, then $s_n = r_n$.

If $\mu$ is diffuse and the coefficients $r_n$ are deterministic (such as in Example 3.1), then the sequence
of the increments $(L_n - L_{n-1})_{n\ge 1}$ (with $L_0 := 0$) is a sequence of independent random variables such
that, for each $n$, the distribution of $L_n - L_{n-1}$ is a Bernoulli distribution with parameter $r_{n-1}$, hence
it is immediate to deduce, under suitable conditions, both a strong law of large numbers and a central
limit theorem for $(L_n)_{n\ge 1}$.
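In the deterministic diffuse case described above, $L_n$ is just a sum of independent Bernoulli variables, which is easy to simulate. The Python sketch below assumes $r_0 = 1$ (so that $L_1 = 1$) and takes $r_k = \theta/(\theta + k)$ with $\theta = 1$ purely as an illustration; the function names `simulate_length` and `expected_length` are hypothetical.

```python
import random

def simulate_length(r, n, rng):
    """Simulate L_n when mu is diffuse and r = [r_1, ..., r_{n-1}] is deterministic.
    L_1 = 1 and L_{k+1} - L_k is Bernoulli(r_k), independently over k."""
    length = 1
    for k in range(1, n):
        if rng.random() < r[k - 1]:
            length += 1
    return length

def expected_length(r, n):
    """E[L_n] = 1 + r_1 + ... + r_{n-1}; the leading 1 is r_0."""
    return 1 + sum(r[: n - 1])

n = 1000
r = [1.0 / (1.0 + k) for k in range(1, n)]       # r_k = theta / (theta + k), theta = 1
rng = random.Random(0)
runs = [simulate_length(r, n, rng) for _ in range(200)]
print(sum(runs) / len(runs), expected_length(r, n))
```

The empirical average of the simulated lengths should be close to the exact mean, which grows like $\log n$ for this choice of $r_k$.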

In this section we prove a law of large numbers and a central limit theorem for a GOS. Moreover, 
some examples of GOS that satisfy the hypotheses of these results are given. 

Theorem 5.2. Let $(X_n)_{n\ge 1}$ be a generalized species sampling sequence. Suppose that there exists a
sequence $(h_n)_{n\ge 1}$ of real numbers and a random variable $L$ such that the following properties hold:

$$h_n > 0, \qquad h_n \uparrow +\infty, \qquad \sum_{j\ge 1} \frac{E[s_j(1 - s_j)]}{h_j^2} < +\infty, \qquad \frac{1}{h_n}\sum_{j=0}^{n-1} s_j \xrightarrow{a.s.} L.$$

Then we have $L_n/h_n \xrightarrow{a.s.} L$.

Remark 5.3. Let us note that, for each $n$, we have

$$E[L_{n+1} \mid \mathcal{F}_n] = L_n + s_n \ge L_n.$$

Hence the sequence $(L_n)_{n\ge 0}$ is a positive submartingale with $E[L_{n+1}] = E[L_n] + E[s_n]$. Therefore
$(L_n)_{n\ge 0}$ is bounded in $L^1$ if and only if we have $\sum_{k\ge 0} E[s_k] < +\infty$ and, in this case, $(L_n)_{n\ge 0}$
converges almost surely to an integrable random variable. It follows that, for each sequence $(h_n)_{n\ge 0}$
with $h_n \to +\infty$, the ratio $L_n/h_n$ goes almost surely to zero. An example of this situation is given by
Example 3.2 with $\mu$ diffuse. Indeed, in this case, we have $E[s_n] = E[Y_n] = (1/2)^{n-1}$.

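Theorem 5.4. Let $(X_n)_{n\ge 1}$ be a GOS with $\mu$ diffuse and suppose there exists a sequence $(h_n)_{n\ge 1}$ of
real numbers and a positive random variable $\sigma^2$ such that the following properties hold:

$$h_n > 0, \qquad h_n \uparrow +\infty, \qquad \sigma_n^2 := \frac{\sum_{j=1}^{n} r_j(1 - r_j)}{h_n} \xrightarrow{a.s.} \sigma^2.$$

Then, setting $R_n := \sum_{j=1}^{n} r_j$, we have

$$T_n := \frac{L_n - R_{n-1}}{\sqrt{h_n}} \xrightarrow{\mathcal{A}\text{-stably}} \mathcal{N}(0, \sigma^2).$$

Corollary 5.5. Under the same assumptions of Theorem 5.4, if $P(\sigma^2 > 0) = 1$, then we have

$$\frac{T_n}{\sigma_n} = \frac{L_n - R_{n-1}}{\sqrt{\sum_{j=1}^{n} r_j(1 - r_j)}} \xrightarrow{\mathcal{A}\text{-stably}} \mathcal{N}(0, 1).$$

Example 5.6. Let us consider Example 3.1 with $\mu$ diffuse and

$$a_n = \frac{\theta}{\theta + n^{1-\alpha}}$$

with $\theta > 0$ and $0 < \alpha < 1$. We have $s_n = r_n = a_n$ and, setting $h_n = n^\alpha$ and $L = \theta/\alpha$, the
assumptions of Theorem 5.2 are satisfied. Indeed we have

$$\sum_j \frac{r_j(1 - r_j)}{j^{2\alpha}} = \sum_j \frac{\theta\, j^{1-\alpha}}{(\theta + j^{1-\alpha})^2\, j^{2\alpha}} = \theta \sum_j \Big(\frac{j^{1-\alpha}}{\theta + j^{1-\alpha}}\Big)^2 \frac{1}{j^{1+\alpha}} < +\infty.$$

Moreover, since

$$\frac{1}{n^\alpha}\sum_{j=1}^{n} \frac{1}{j^{1-\alpha}} \longrightarrow \frac{1}{\alpha} \quad \text{for } \alpha \in (0, 1), \qquad (16)$$

we have

$$\frac{1}{n^\alpha}\sum_{j=1}^{n} r_j = \frac{1}{n^\alpha}\sum_{j=1}^{n} \frac{\theta}{\theta + j^{1-\alpha}} \longrightarrow \frac{\theta}{\alpha}.$$

Thus we have $L_n/n^\alpha \xrightarrow{a.s.} \theta/\alpha$. Finally, since

$$\frac{1}{c_n}\sum_{j=1}^{n} a_j b_j \longrightarrow b \qquad (17)$$

provided that $a_j \ge 0$, $\sum_{j=1}^{n} a_j / c_n \to 1$ and $b_n \to b$ as $n \to +\infty$ (here $(a_j)$, $(b_j)$ and $(c_n)$ denote generic numerical sequences), it is easy to see that

$$\sigma_n^2 = \frac{1}{n^\alpha}\sum_{j=1}^{n} r_j(1 - r_j) \longrightarrow \frac{\theta}{\alpha}.$$

Therefore, by Theorem 5.4 we obtain

$$T_n = \frac{L_n - R_{n-1}}{n^{\alpha/2}} \xrightarrow{\mathcal{A}\text{-stably}} \mathcal{N}(0, \theta/\alpha).$$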





Example 5.7. Let us consider the setting of Example 3.3 with $\mu$ diffuse and

\[ r_n = \frac{\theta}{\theta + \sum_{i=1}^{n} Y_i} \]

where $\theta > 0$ and the random variables $Y_n$ are independent, identically distributed, positive random
variables with $E[Y_n] = m > 0$. Then $s_n = r_n$ and, setting $h_n = \log n$ and $L = \theta/m$, the assumptions
of Theorem 5.2 are satisfied. Indeed

Ehd-r,)] 1 
^ (log;/) 2 " ^ (logj) 2 
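Moreover, by the strong law of large numbers, we have

\[ \frac{1}{n} \sum_{i=1}^{n} Y_i \xrightarrow{a.s.} m. \]

Therefore, since $\sum_{j=1}^{n} j^{-1}/\log n \to 1$, by (17) we can conclude that

\[ \frac{1}{\log n} R_n = \frac{1}{\log n} \sum_{j=1}^{n} \frac{\theta}{\theta + \sum_{i=1}^{j} Y_i} = \frac{1}{\log n} \sum_{j=1}^{n} \frac{1}{j}\, \theta \Big(\frac{\theta}{j} + \frac{1}{j}\sum_{i=1}^{j} Y_i\Big)^{-1} \xrightarrow{a.s.} \frac{\theta}{m} \]

and so $L_n/\log n \xrightarrow{a.s.} \theta/m$. Moreover, by (17) and the strong law of large numbers, we have

\[ \sigma_n^2 = \frac{1}{\log n} \sum_{j=1}^{n} r_j(1-r_j) = \frac{1}{\log n} \sum_{j=1}^{n} \frac{\theta \sum_{i=1}^{j} Y_i}{(\theta + \sum_{i=1}^{j} Y_i)^2} = \frac{1}{\log n} \sum_{j=1}^{n} \frac{1}{j}\, \frac{\theta\, j^{-1}\sum_{i=1}^{j} Y_i}{(\theta/j + j^{-1}\sum_{i=1}^{j} Y_i)^2} \xrightarrow{a.s.} \frac{\theta}{m}. \]

Therefore, by Theorem 5.4 we obtain

\[ T_n = \frac{L_n - R_{n-1}}{\sqrt{\log n}} \xrightarrow{\mathcal{A}\text{-stably}} N(0, \theta/m) \]

and so

\[ \frac{L_n - R_{n-1}}{\sqrt{(\theta/m) \log n}} \xrightarrow{\mathcal{A}\text{-stably}} N(0,1). \]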

If we take $Y_i = 1$ for all $i$, we find the well-known results for the asymptotic distribution of the
length of the random partition obtained with the Blackwell-MacQueen urn scheme. Indeed, since
$\sum_{j=1}^{n} j^{-1} - \log n = \gamma + O(1/n)$, one gets

\[ \frac{L_n - \theta \log n}{\sqrt{\theta \log n}} \xrightarrow{\mathcal{A}\text{-stably}} N(0,1). \]

See, for instance, pages 68-69 in Pitman (2006).
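The Blackwell-MacQueen case can be checked numerically. The following is only a minimal sketch, assuming (as in the case $Y_i = 1$ with $\mu$ diffuse) that the increments of $L_n$ are independent with $U_1 = 1$ and $U_{j+1}$ Bernoulli with parameter $\theta/(\theta + j)$. The function names `partition_length` and `standardized_lengths` and the parameter values are illustrative choices, not part of the paper.

```python
import math
import random


def partition_length(theta, n, rng):
    """Simulate L_n when U_1 = 1 and, independently, U_{j+1} ~ Bernoulli(theta / (theta + j))."""
    length = 1  # the first observation always opens a new block
    for j in range(1, n):
        if rng.random() < theta / (theta + j):
            length += 1
    return length


def standardized_lengths(theta, n, reps, seed=0):
    """Return `reps` simulated values of (L_n - theta * log n) / sqrt(theta * log n)."""
    rng = random.Random(seed)
    centre = theta * math.log(n)
    scale = math.sqrt(centre)
    return [(partition_length(theta, n, rng) - centre) / scale for _ in range(reps)]


if __name__ == "__main__":
    values = standardized_lengths(theta=2.0, n=5000, reps=300)
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    # Both numbers should drift towards 0 and 1 as n grows, but only at a logarithmic rate.
    print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Because the normalization is $\sqrt{\theta \log n}$, the Gaussian approximation improves slowly, so a visible bias in the sample mean and variance at moderate $n$ is to be expected.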



6 Proofs. 

This section contains all the proofs of the paper. Recall that 

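\[ \mathcal{F}_n = \mathcal{F}^X_n \vee \mathcal{F}^Y_n \quad \text{and} \quad \mathcal{G}_n = \mathcal{F}_n \vee \mathcal{F}^Y_\infty = \mathcal{F}_n \vee \sigma(Y_{n+j} : j \geq 1) \]

and so condition (h3) of the definition of generalized species sampling sequence implies that

\[ V_n^g := E[g(X_{n+1}) \mid \mathcal{G}_n] = E[g(X_{n+1}) \mid \mathcal{F}_n] \]

for each bounded Borel function $g$ on $E$.

6.1 Proof of Theorem 2.2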

We start with a useful lemma. 

Lemma 6.1. If $(X_n)_{n\geq 1}$ is a generalized species sampling sequence, then we have

\[ P[n+1 \in \pi_l^{(n+1)} \mid \mathcal{G}_n] = P[X_{n+1} = X^*_l \mid \mathcal{F}_n] = \sum_{j \in \pi_l^{(n)}} p_{n,j}(\pi^{(n)}, Y(n)) + r_n(\pi^{(n)}, Y(n))\, \mu(\{X^*_l\}) \]

for each $l = 1, \dots, L_n$. Moreover,

\[ E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{G}_n] = E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{F}_n] = r_n(\pi^{(n)}, Y(n)) \int_{A_n} f(y)\, \mu(dy) \]

holds true with $A_0 := E$ and $A_n$ the random "set" defined by

\[ A_n(\omega) := E \setminus \{X_1(\omega), \dots, X_n(\omega)\} = \{y \in E : y \notin \{X_1(\omega), \dots, X_n(\omega)\}\} \quad \text{for } n \geq 1. \]

In particular, we have

\[ P[L_{n+1} = L_n + 1 \mid \mathcal{G}_n] = P[L_{n+1} = L_n + 1 \mid \mathcal{F}_n] = r_n(\pi^{(n)}, Y(n))\, \mu(A_n) =: s_n(\pi^{(n)}, Y(n)). \]

If $\mu$ is diffuse, we have

\[ P[n+1 \in \pi_l^{(n+1)} \mid \mathcal{G}_n] = P[X_{n+1} = X^*_l \mid \mathcal{F}_n] = \sum_{j \in \pi_l^{(n)}} p_{n,j}(\pi^{(n)}, Y(n)) \]

for each $l = 1, \dots, L_n$ and

\[ E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{G}_n] = E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{F}_n] = r_n(\pi^{(n)}, Y(n))\, E[f(X_1)] \]

and

\[ P[L_{n+1} = L_n + 1 \mid \mathcal{G}_n] = P[L_{n+1} = L_n + 1 \mid \mathcal{F}_n] = r_n(\pi^{(n)}, Y(n)). \]
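Proof. Since $\mathcal{G}_n = \mathcal{F}_n \vee \sigma(Y_{n+j} : j \geq 1)$, condition (h3) implies that

\[ P[n+1 \in \pi_l^{(n+1)} \mid \mathcal{G}_n] = P[X_{n+1} = X^*_l \mid \mathcal{G}_n] = P[X_{n+1} = X^*_l \mid \mathcal{F}_n]. \]

Hence, by assumption (h2), we have

\[ P[X_{n+1} = X^*_l \mid \mathcal{F}_n] = \sum_{i=1}^{n} p_{n,i}(\pi^{(n)}, Y(n))\, \delta_{X_i}(\{X^*_l\}) + r_n(\pi^{(n)}, Y(n))\, \mu(\{X^*_l\}) \]
\[ = \sum_{j \in \pi_l^{(n)}} p_{n,j}(\pi^{(n)}, Y(n)) + r_n(\pi^{(n)}, Y(n))\, \mu(\{X^*_l\}) \]

for each $l = 1, \dots, L_n$. If $\mu$ is diffuse, we obtain

\[ P[X_{n+1} = X^*_l \mid \mathcal{F}_n] = \sum_{j \in \pi_l^{(n)}} p_{n,j}(\pi^{(n)}, Y(n)) \]

for each $l = 1, \dots, L_n$.
Now, we observe that

\[ I_{\{L_{n+1} = L_n + 1\}} = I_{B_n}(X_1, \dots, X_n, X_{n+1}) \]

where $B_n = \{(x_1, \dots, x_{n+1}) : x_{n+1} \notin \{x_1, \dots, x_n\}\}$. Thus, by (h3) and (h2), we have

\[ E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{G}_n] = E[I_{\{L_{n+1} = L_n + 1\}} f(X_{n+1}) \mid \mathcal{F}_n] \]
\[ = \int I_{B_n}(X_1, \dots, X_n, y)\, f(y)\, K_{n+1}(\cdot, dy) \]
\[ = \sum_{i=1}^{n} p_{n,i}(\pi^{(n)}, Y(n)) \int_{A_n} f(y)\, \delta_{X_i}(dy) + r_n(\pi^{(n)}, Y(n)) \int_{A_n} f(y)\, \mu(dy) \]
\[ = r_n(\pi^{(n)}, Y(n)) \int_{A_n} f(y)\, \mu(dy). \]

If we take $f = 1$, we get

\[ P[U_{n+1} = 1 \mid \mathcal{G}_n] = P[U_{n+1} = 1 \mid \mathcal{F}_n] = r_n(\pi^{(n)}, Y(n))\, \mu(A_n). \]

Finally, if $\mu$ is diffuse, then $\mu(A_n(\omega)) = 1$ for each $\omega$ and so we have

\[ \int_{A_n} f(y)\, \mu(dy) = E[f(X_1)]. \]

□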
Proof of Theorem 2.2. Let us fix a bounded Borel function $f$ on $E$. Using the given prediction
rule, we have

\[ V_n^f = \sum_{i=1}^{n} p_{n,i}(\pi^{(n)}, Y(n))\, f(X_i) + r_n(\pi^{(n)}, Y(n))\, E[f(X_1)] \]
\[ = \sum_{l=1}^{L_n} p^*_{n,l}(\pi^{(n)})\, f(X^*_l) + r_n E[f(X_1)]. \]

The sequence $(X_n)$ is $\mathcal{G}$-cid if and only if, for each bounded Borel function $f$ on $E$, the sequence
$(V_n^f)_{n\geq 0}$ is a $\mathcal{G}$-martingale. We observe that we have (for the sake of simplicity we skip the dependence
on $(Y_n)_{n\geq 1}$)

\[ E[V_{n+1}^f \mid \mathcal{G}_n] = \sum_{i=1}^{n} f(X_i)\, E_i + E[p_{n+1,n+1}(\pi^{(n+1)}) f(X_{n+1}) \mid \mathcal{G}_n] + E[r_{n+1} \mid \mathcal{G}_n]\, \bar f \]

\[ = \sum_{l=1}^{L_n} f(X^*_l) \sum_{i \in \pi_l^{(n)}} E_i + E[p_{n+1,n+1}(\pi^{(n+1)}) f(X_{n+1}) \mid \mathcal{G}_n] + E[r_{n+1} \mid \mathcal{G}_n]\, \bar f \]

where $E_i = E[p_{n+1,i}(\pi^{(n+1)}) \mid \mathcal{G}_n]$ and $\bar f = E[f(X_1)]$.

Now we are going to compute the various conditional expectations which appear in the right-hand side
of the above equality. Since $\mu$ is diffuse, using Lemma 6.1 we have

E, =E[p n+1 , t (n {n+1) )\g n ] 

= Ef= 1 E[/ {n+le7r (» + i) } p ra+ i, i (7r("+ 1 )) | g n ] + E[/ {Ln+1=i „+ 1} p„ +1 , ! (^+ 1 >) | g n ] 

= E i i>»+i,4[^ (n) ] ! +)E[I {n+l67r („ +1)} I g n ] +E[J {i „ +1=in+1} | 5»]p»+i,i([7rW;n + l]) 

= Ei3tP»+l,4[ 7I ' ( ™ ) ]!+)E J - ew ('«)Pn,i( 7rCTl) ) +T-nPn+l,i([7r (7l) ;n+ 1]) 

= Ef=V«+i.i(k ( "VK,*(7r (n) ) +r„p n+ i, i (k ( " ) ;n+ i]) 

and so 

J2 Ei= J2 p; + i J (k (n) ] ;+ K, i (^ ( " ) )+ E P^M(k (B) ]i+)^(T (n) ) + rnP: + ij([w w ;n + l]) 

(n) l=l,lj£j -r- (") 

J 3 

= Ef=lK+l,i([7T ( ' l) ]i+))'™,!( 7I ' (Tl) ) -P™+l,«+l([ 7r(n) ]i+)Pn+l,i( 7r( " ) ) +rnP™+l,i([l" (n) ;n+ 1]) 

Moreover, using Lemma 6.1 again, we have

E[p„ +1 , n+1 (7r (n+1) )/(x n+1 ) i g n ] = 

Y.v[i {n+1 ^+v } Pr,+±, n +^ n+1) )f{x n+1 )\g n ] + w 
1=1 1 

E E [ / { n + i e ^"+ 1 )} I Gn]Pn+l,n+l([^ n) ]l+)f(Xn + W{L n + 1 =L n + l}f(X n+1 ) | 0»Wl,„+l([jT (n) ]; U + 1) = 
(=1 ' 

([* (n) ] I+ )/(**) + 

([ 7 r ( ™ ) ];n + l)/ = 

Ef=lPn,l( 7i " (n) )Pn + l,« + l([ 7! " (n) ]i+)/(^*) +r n p n + l, n+ l([7T (,l) ];71+ 1)/. 

Finally we have 

E[r n+1 | 5„] = 1 - E[p n+1 , l (7r("+ 1 )) | g n ] 

= 1 — E"=i ^ — 

= 1 - ELi^ " Et"K,«(^ W )Pn + i,n+i(k (n) ] I+ ) - r„p„ + i,„ + i(brW];n + 1) 
Thus we get
where 



= E^W^ +Pn + l,«+l([7T C " ) ], + K J (7rW) 



= r„p* +M ([7r ( " ) ;n+ 1]) + Ei=iP«+i,i([ lr ]i+)Pn,i( ,r ) 
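We can conclude that $(X_n)_{n\geq 1}$ is $\mathcal{G}$-cid if and only if we have, for each bounded Borel function $f$ on
$E$ and each $n$,

\[ \sum_{j=1}^{L_n} p^*_{n,j}\, f(X^*_j) + r_n \bar f = \sum_{j=1}^{L_n} c_{n,j}\, f(X^*_j) + \Big(1 - \sum_{j=1}^{L_n} c_{n,j}\Big) \bar f \quad P\text{-almost surely.} \]

Since $E$ is a Polish space, we may affirm that $(X_n)_{n\geq 1}$ is $\mathcal{G}$-cid if and only if, for each $n$, we have
$P$-almost surely

\[ \sum_{j=1}^{L_n} p^*_{n,j}\, \delta_{X^*_j}(\cdot) + r_n\, \mu(\cdot) = \sum_{j=1}^{L_n} c_{n,j}\, \delta_{X^*_j}(\cdot) + \Big(1 - \sum_{j=1}^{L_n} c_{n,j}\Big) \mu(\cdot). \]

But this last equality holds if and only if, for each $n$, we have $P$-almost surely

\[ p^*_{n,j} = c_{n,j} \quad \text{for } 1 \leq j \leq L_n; \]

that is

\[ p^*_{n,j}(\pi^{(n)}) = r_n\, p^*_{n+1,j}([\pi^{(n)}; \{n+1\}]) + \sum_{l=1}^{L_n} p^*_{n+1,j}([\pi^{(n)}]_{l+})\, p^*_{n,l}(\pi^{(n)}). \]

This is exactly the condition in the statement of Theorem 2.2.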

□ 

6.2 Proofs of Section 4

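Proof of Theorem 4.1. We will use Theorem A.2 in the Appendix. For each $n \geq 1$, let us set

\[ D_n^f = \sqrt{n}\,(M_n^f - V_n^f) = \frac{1}{\sqrt{n}} \Big[\sum_{k=1}^{n} f(X_k) - n V_n^f\Big], \]

and, for $0 \leq j \leq n$,

\[ L_{n,j} = E[D_n^f \mid \mathcal{G}_j], \qquad \mathcal{F}_{n,j} = \mathcal{G}_j. \]

Then, for each $n \geq 1$, the sequence $(L_{n,j})_{0\leq j\leq n}$ is a martingale with respect to $(\mathcal{F}_{n,j})_{0\leq j\leq n}$ such that
$L_{n,0} = E[D_n^f \mid \mathcal{G}_0] = 0$ and, for $1 \leq j \leq n$,

\[ Z_{n,j} := L_{n,j} - L_{n,j-1} = E[D_n^f \mid \mathcal{G}_j] - E[D_n^f \mid \mathcal{G}_{j-1}] = \frac{Z_j^f}{\sqrt{n}}. \]

Indeed, using (12) we have

\[ E[D_n^f \mid \mathcal{G}_j] - E[D_n^f \mid \mathcal{G}_{j-1}] = \frac{1}{\sqrt{n}} \Big[\sum_{i=1}^{j} f(X_i) + (n-j) V_j^f - n V_j^f - \sum_{i=1}^{j-1} f(X_i) - (n-j+1) V_{j-1}^f + n V_{j-1}^f\Big]. \]

Moreover, we have

\[ D_n^f = E[D_n^f \mid \mathcal{G}_n] = L_{n,n} = \sum_{j=1}^{n} Z_{n,j}. \]

Finally, we have

\[ \mathcal{H}_j = \liminf_n \mathcal{F}_{n, j\wedge n} = \liminf_n \mathcal{G}_{j\wedge n} = \mathcal{G}_j \]

and, if we set

\[ \mathcal{H} = \bigvee_{j\geq 0} \mathcal{H}_j = \bigvee_{j\geq 0} \mathcal{G}_j, \]

then the random variable $U^f$ is measurable with respect to the $\sigma$-field $\mathcal{H} \vee \mathcal{N}$. At this point we can
apply Theorem A.2 and the proof of the first assertion is concluded.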

If condition (a1) holds, then condition (a) is obviously verified. Moreover we have

where

\[ Z_j^f = f(X_j) - j V_j^f + (j-1) V_{j-1}^f. \]

We can write

This fact implies that

\[ (Z_n^f)^* = \sup_{1\leq j\leq n} |Z_{n,j}| \xrightarrow{a.s.} 0. \]

Indeed,

\[ \sup_{1\leq j\leq n} (Z_{n,j})^2 = \frac{1}{n} \sup_{1\leq j\leq n} (Z_j^f)^2 \xrightarrow{a.s.} 0. \]

Further, we have

\[ E[((Z_n^f)^*)^2] = E\Big[\sup_{1\leq j\leq n} Z_{n,j}^2\Big] \leq \sum_{j=1}^{n} E[Z_{n,j}^2] = \sum_{j=1}^{n} E[(L_{n,j} - L_{n,j-1})^2] \]

\[ = \sum_{j=1}^{n} E[L_{n,j}^2] - \sum_{j=1}^{n} E[L_{n,j-1}^2] = E[L_{n,n}^2] = n\, E[(M_n^f - V_n^f)^2]. \]

From (b1) and the above relations, we obtain that the sequence $((Z_n^f)^*)_n$ is bounded in $L^2$ and so
we get condition (b). □
Proof of Theorem 4.3. Without loss of generality, we may assume $|f| \leq 1$. It will be sufficient
to prove that the sequence $(V_n^f)_{n\geq 0}$ satisfies conditions (a) and (b) of Theorem A.3 with $U =
H(V_{f^2} - V_f^2)$. To this end, we observe firstly that, after some calculations, we have

\[ V_k^f - V_{k+1}^f = [V_k^f - f(X_{k+1})]\, Q_k. \qquad (18) \]

From this equality we get $|V_k^f - V_{k+1}^f| \leq Q_k$, and so, using assumption (ii), we find

\[ \sup_k k^2 |V_k^f - V_{k+1}^f|^4 \leq \sum_{k\geq 0} k^2 Q_k^4 \in L^1. \]

Furthermore, by (18), we have

\[ n \sum_{k\geq n} (V_k^f - V_{k+1}^f)^2 = n \sum_{k\geq n} [V_k^f - f(X_{k+1})]^2\, Q_k^2. \]

Therefore, in order to complete the proof, it suffices to prove, for $n \to +\infty$, the following convergence:

\[ n \sum_{k\geq n} [V_k^f - f(X_{k+1})]^2\, Q_k^2 \xrightarrow{a.s.} H(V_{f^2} - V_f^2). \]

The above convergence can be rewritten as

\[ n \sum_{k\geq n} \big[(V_k^f)^2 + f^2(X_{k+1}) - 2 V_k^f f(X_{k+1})\big]\, Q_k^2 \xrightarrow{a.s.} H(V_{f^2} - V_f^2). \qquad (19) \]

Now, by assumption (i) and the almost sure convergence of $(V_k^f)_k$ to $V_f$ and of $(V_k^{f^2})_k$ to $V_{f^2}$, we
have

\[ n \sum_{k\geq n} V_k^f\, Q_k^2 \xrightarrow{a.s.} V_f H \qquad (20) \]
\[ n \sum_{k\geq n} (V_k^f)^2\, Q_k^2 \xrightarrow{a.s.} (V_f)^2 H \qquad (21) \]
\[ n \sum_{k\geq n} V_k^{f^2}\, Q_k^2 \xrightarrow{a.s.} V_{f^2} H \qquad (22) \]

Thus, it will be enough to prove the following convergence:

\[ n \sum_{k\geq n} [g(X_{k+1}) - V_k^g]\, Q_k^2 \xrightarrow{a.s.} 0 \qquad (23) \]

where $g$ is a bounded Borel function with $|g| \leq 1$. Indeed, from (23) with $g = f^2$ and (22), we obtain

\[ n \sum_{k\geq n} f^2(X_{k+1})\, Q_k^2 \xrightarrow{a.s.} V_{f^2} H. \qquad (24) \]

Moreover, from (23) with $g = f$ and (20), we obtain

\[ n \sum_{k\geq n} f(X_{k+1})\, Q_k^2 \xrightarrow{a.s.} V_f H \qquad (25) \]

and so, by the almost sure convergence of $(V_k^f)_k$ to $V_f$, we get

\[ n \sum_{k\geq n} V_k^f f(X_{k+1})\, Q_k^2 \xrightarrow{a.s.} (V_f)^2 H. \qquad (26) \]

Then convergence relations (21), (24) and (26) lead us to the desired relation (19).
In order to prove (23), we consider the process $(Z_n)_{n\geq 0}$ defined by

\[ Z_n := \sum_{k=0}^{n-1} k\, [g(X_{k+1}) - V_k^g]\, Q_k^2. \]

It is a martingale with respect to the filtration $\mathcal{G} = (\mathcal{G}_n)_{n\geq 0}$. Moreover, by assumption (ii), we have

\[ E[Z_n^2] = \sum_{k=0}^{n-1} k^2\, E[(g(X_{k+1}) - V_k^g)^2\, Q_k^4] \leq \sum_{k\geq 0} k^2\, E[Q_k^4] < \infty. \qquad (27) \]

The martingale $(Z_n)_{n\geq 0}$ is thus bounded in $L^2$ and so it converges almost surely; that is, the series

\[ \sum_{k\geq 0} k\, [g(X_{k+1}) - V_k^g]\, Q_k^2 \]

is almost surely convergent. On the other hand, by a well-known result of Abel, the convergence
of a series $\sum_k a_k$, with $a_k \in \mathbb{R}$, implies the convergence of the series $\sum_k k^{-1} a_k$ and the relation
$n \sum_{k\geq n} k^{-1} a_k \to 0$ for $n \to +\infty$. Applying this result, we find (23) and the proof is so concluded. □
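The result of Abel used at the end of the proof can be illustrated numerically. The sketch below is only an illustration on one assumed example, the convergent alternating series with $a_k = (-1)^k/\sqrt{k}$; the helper `weighted_tail` and the truncation length `terms` are hypothetical choices made here, and the infinite tail is approximated by a long finite sum.

```python
import math


def a(k):
    """Terms of the convergent alternating series sum_k (-1)^k / sqrt(k)."""
    return (-1) ** k / math.sqrt(k)


def weighted_tail(seq, n, terms=200_000):
    """Approximate n * sum_{k >= n} seq(k) / k by truncating the tail after `terms` summands."""
    return n * math.fsum(seq(k) / k for k in range(n, n + terms))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The printed values should shrink in absolute value as n grows.
        print(n, weighted_tail(a, n))
```

For this particular sequence the weighted tail decays roughly like $n^{-1/2}$, which is consistent with the relation $n \sum_{k\geq n} k^{-1} a_k \to 0$ but says nothing about the rate in general.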
Proof of Corollary 4.4. It will suffice to verify that conditions (i) and (ii) of Theorem 4.3 hold
with $H = h$. With regard to condition (ii), it is enough to observe that, by the obvious inequality
$Q_k = r_{k+1} \rho_k \leq c_{k+1} \rho_k$ and the identity in distribution of the random variables $\rho_k$, we have

\[ \sum_{k\geq 0} k^2\, E[Q_k^4] \leq \sum_{k\geq 0} k^2 c_{k+1}^4\, E[\rho_k^4] = E[\rho_0^4] \sum_{k\geq 0} k^2 c_{k+1}^4 < \infty. \]

In order to prove condition (i) of Theorem 4.3 (with $H = h$), we observe that the series

\[ \sum_k k^{-1} (\rho_k^2 - \beta) \]

is almost surely convergent: indeed, the random variables $Z_k := k^{-1}(\rho_k^2 - \beta)$ are independent, centered
and square-integrable, with $\mathrm{Var}[Z_k] = k^{-2}\, \mathrm{Var}[\rho_1^2]$. Therefore, by the above-mentioned result of Abel,
we obtain the almost sure convergence of the series

\[ \sum_k k^{-2} (\rho_k^2 - \beta) \]

and the relation (for $n \to +\infty$)

\[ n \sum_{k\geq n} k^{-2} (\rho_k^2 - \beta) \xrightarrow{a.s.} 0. \]

Since we have $n \sum_{k\geq n} k^{-2} \to 1$ for $n \to +\infty$, the above relation can be rewritten in the form

\[ n \sum_{k\geq n} k^{-2} \rho_k^2 \xrightarrow{a.s.} \beta. \]

Now we observe that$^1$

\[ Q_k^2 = r_{k+1}^2 \rho_k^2 \overset{a.s.}{\sim} \alpha^2 k^{-2} \rho_k^2. \]

Hence, for $n \to +\infty$, we have

\[ n \sum_{k\geq n} Q_k^2 \overset{a.s.}{\sim} \alpha^2\, n \sum_{k\geq n} k^{-2} \rho_k^2 \xrightarrow{a.s.} \alpha^2 \beta = h. \]

Condition (i) of Theorem 4.3 (with $H = h$) is thus proved and the proof is concluded. □

6.3 Proofs of Section 5

In order to study the asymptotic behavior of $(L_n)_{n\geq 1}$ it will be useful to introduce the sequence of
the increments

\[ U_1 := L_1 = 1 \quad \text{and} \quad U_n := L_n - L_{n-1} \text{ for } n \geq 2. \]

Clearly $(U_n)_{n\geq 1}$ is a sequence of random variables with values in $\{0, 1\}$ such that, for each $n \geq 1$, the
random variable $U_n$ is $\mathcal{F}_n$-measurable and $L_n = \sum_{j=1}^{n} U_j$.

Proof of Theorem 5.2. Without loss of generality, we can assume $h_n > 0$ for each $n$. Let us
set

\[ Z_0 := 0, \qquad Z_n := \sum_{j=1}^{n} \frac{U_j - s_{j-1}}{h_j}. \]

Then $Z = (Z_n)_{n\geq 0}$ is a martingale with respect to the filtration $\mathcal{F} = (\mathcal{F}_n)_{n\geq 0}$. Indeed, by Lemma
6.1 we have

\[ E[Z_{n+1} - Z_n \mid \mathcal{F}_n] = \frac{1}{h_{n+1}} E[U_{n+1} - s_n \mid \mathcal{F}_n] = \frac{1}{h_{n+1}} E[I_{\{L_{n+1} = L_n + 1\}} - s_n \mid \mathcal{F}_n] = 0. \]

Moreover, we have

\[ E[U_{n+1}] = P(L_{n+1} = L_n + 1) = E[s_n] \]

and

\[ E[(U_{n+1} - s_n)^2] = E\big[E[(U_{n+1} - s_n)^2 \mid \mathcal{F}_n]\big] \]

\[ = E[(1 - s_n)^2 s_n + s_n^2 (1 - s_n)] = E[s_n(1 - s_n)]. \]



$^1$ Given two sequences $(a_n)$, $(b_n)$ of random variables, the notation $a_n \overset{a.s.}{\sim} b_n$ means that $a_n/b_n \xrightarrow{a.s.} 1$.
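Therefore we obtain

\[ E[Z_n^2] = \sum_{j=1}^{n} \frac{E[(U_j - s_{j-1})^2]}{h_j^2} = \sum_{j=1}^{n} \frac{E[s_{j-1}(1 - s_{j-1})]}{h_j^2} \]

and so

\[ \sup_n E[Z_n^2] \leq \sum_{j\geq 1} \frac{E[s_{j-1}(1 - s_{j-1})]}{h_j^2} < +\infty. \]

It follows that $(Z_n)_{n\geq 1}$ converges almost surely and, by Kronecker's lemma, we get

\[ \frac{1}{h_n} \Big(L_n - \sum_{j=1}^{n} s_{j-1}\Big) = \frac{1}{h_n} \sum_{j=1}^{n} (U_j - s_{j-1}) \xrightarrow{a.s.} 0. \]

Therefore, since $\sum_{j=1}^{n} s_{j-1}/h_n = \sum_{j=0}^{n} s_j/h_n - s_n/h_n \xrightarrow{a.s.} L$, we obtain $L_n/h_n \xrightarrow{a.s.} L$. □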
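Kronecker's lemma, which turns the convergence of the series $\sum_j x_j/h_j$ into the convergence to zero of $h_n^{-1} \sum_{j\leq n} x_j$, can be illustrated on a deterministic example. The sketch below assumes the illustrative choice $x_j = (-1)^j \sqrt{j}$ and $h_j = j$, for which $\sum_j x_j/h_j = \sum_j (-1)^j/\sqrt{j}$ converges; the function name `kronecker_ratio` is hypothetical and not taken from the paper.

```python
import math


def x(j):
    """Illustrative terms x_j = (-1)^j * sqrt(j); the series sum_j x_j / j converges."""
    return (-1) ** j * math.sqrt(j)


def h(n):
    """Illustrative normalizing sequence h_n = n, positive and increasing to infinity."""
    return float(n)


def kronecker_ratio(n):
    """Return (1 / h_n) * sum_{j=1}^{n} x_j, which Kronecker's lemma sends to zero."""
    return math.fsum(x(j) for j in range(1, n + 1)) / h(n)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        # The ratio should shrink in absolute value as n grows.
        print(n, kronecker_ratio(n))
```

In the proof above the role of $x_j$ is played by the random increments $U_j - s_{j-1}$, and the convergence of $\sum_j x_j/h_j$ comes from the $L^2$-bounded martingale $(Z_n)$ rather than from an alternating-series argument.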
In order to prove Theorem 5.4 we need a preliminary lemma.

Lemma 6.2. If $(X_n)_{n\geq 1}$ is a GOS with $\mu$ diffuse, then (with the previous notation), for each fixed
$k$, a version of the conditional distribution of $(U_j)_{j\geq k+1}$ given $\mathcal{G}_k$ is the kernel $Q_k$ so defined:

\[ Q_k(\omega, \cdot) = \bigotimes_{j=k+1}^{\infty} B(1, r_{j-1}(\omega)) \]

where $B(1, r_{j-1}(\omega))$ denotes the Bernoulli distribution with parameter $r_{j-1}(\omega)$.

Proof. It is enough to verify that, for each $n \geq 1$, for each $e_{k+1}, \dots, e_{k+n} \in \{0, 1\}$ and for each
$\mathcal{G}_k$-measurable real-valued bounded random variable $Z$, we have

\[ E\big[Z\, I_{\{U_{k+1} = e_{k+1}, \dots, U_{k+n} = e_{k+n}\}}\big] = E\Big[Z \prod_{j=k+1}^{k+n} r_{j-1}^{e_j} (1 - r_{j-1})^{1 - e_j}\Big]. \qquad (28) \]

We go on with the proof by induction on $n$. For $n = 1$, by Lemma 6.1 we have

\[ E\big[Z\, I_{\{U_{k+1} = e_{k+1}\}}\big] = E\big[Z\, E[I_{\{U_{k+1} = e_{k+1}\}} \mid \mathcal{G}_k]\big] = E\big[Z\, r_k^{e_{k+1}} (1 - r_k)^{1 - e_{k+1}}\big]. \]

Assume that (28) is true for $n-1$ and let us prove it for $n$. Let us fix a $\mathcal{G}_k$-measurable real-valued
bounded random variable $Z$. By Lemma 6.1 we have

\[ E\big[Z\, I_{\{U_{k+1} = e_{k+1}, \dots, U_{k+n} = e_{k+n}\}}\big] = E\big[Z\, I_{\{U_{k+1} = e_{k+1}, \dots, U_{k+n-1} = e_{k+n-1}\}}\, P[U_{k+n} = e_{k+n} \mid \mathcal{G}_{k+n-1}]\big] \]

\[ = E\big[Z\, r_{k+n-1}^{e_{k+n}} (1 - r_{k+n-1})^{1 - e_{k+n}}\, I_{\{U_{k+1} = e_{k+1}, \dots, U_{k+n-1} = e_{k+n-1}\}}\big]. \]

We are done because the random variable $Z\, r_{k+n-1}^{e_{k+n}} (1 - r_{k+n-1})^{1 - e_{k+n}}$ is also $\mathcal{G}_k$-measurable and (28)
is true for $n-1$. □



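Proof of Theorem 5.4. Without loss of generality, we can assume $h_n > 0$ for each $n$. In order
to prove the desired $\mathcal{A}$-stable convergence, it is enough to prove the $\mathcal{F}^X_\infty \vee \mathcal{F}^Y_\infty$-stable convergence
of $(T_n)$ to $N(0, \sigma^2)$. But, in order to prove this last convergence, since we have $\mathcal{F}^X_\infty \vee \mathcal{F}^Y_\infty = \bigvee_k \mathcal{G}_k$,
it suffices to prove that, for each $k$ and $A$ in $\mathcal{G}_k$ with $P(A) \neq 0$, the sequence $(T_n)$ converges in
distribution under $P_A$ to the probability measure $P_A N(0, \sigma^2)$. In other words, it is sufficient to fix
$k$ and to verify that $(T_{k+n})_n$ (and so $(T_n)_n$) converges $\mathcal{G}_k$-stably to $N(0, \sigma^2)$. (Note that the kernel
$N(0, \sigma^2)$ is $\mathcal{G}_k \vee \mathcal{N}$-measurable for each fixed $k$.) To this end, we observe that we have

\[ T_{k+n} = \frac{L_{k+n} - R_{k+n-1}}{\sqrt{h_{k+n}}} = \frac{L_k - R_{k-1}}{\sqrt{h_{k+n}}} + \frac{\sum_{j=k+1}^{k+n} (U_j - r_{j-1})}{\sqrt{h_{k+n}}}. \]

Obviously, for $n \to +\infty$, we have

\[ \frac{L_k - R_{k-1}}{\sqrt{h_{k+n}}} \to 0. \]

Therefore we have to prove

\[ \frac{\sum_{j=k+1}^{k+n} (U_j - r_{j-1})}{\sqrt{h_{k+n}}} \xrightarrow{\mathcal{G}_k\text{-stably}} N(0, \sigma^2). \qquad (29) \]

From Lemma 6.2 we know that a version of the conditional distribution of $(U_j)_{j\geq k+1}$ given
$\mathcal{G}_k$ is the kernel $Q_k$ so defined:

\[ Q_k(\omega, \cdot) = \bigotimes_{j=k+1}^{\infty} B(1, r_{j-1}(\omega)). \]

On the canonical space $\mathbb{R}^{\mathbb{N}}$ let us consider the canonical projections $(\xi_j)_{j\geq k+1}$. Then, for each $n \geq 1$,
a version of the conditional distribution of

\[ \frac{\sum_{j=k+1}^{k+n} (U_j - r_{j-1})}{\sqrt{h_{k+n}}} \]

given $\mathcal{G}_k$ is the kernel $N_{k+n}$ so characterized: for each $\omega$, the probability measure $N_{k+n}(\omega, \cdot)$ is the
distribution, under the probability measure $Q_k(\omega, \cdot)$, of the random variable (which is defined on the
canonical space)

\[ \frac{\sum_{j=k+1}^{k+n} (\xi_j - r_{j-1}(\omega))}{\sqrt{h_{k+n}}}. \]

On the other hand, for almost every $\omega$, under $Q_k(\omega, \cdot)$, the random variables

\[ Z_{n,i} := \frac{\xi_{k+i} - r_{k+i-1}(\omega)}{\sqrt{h_{k+n}}} \quad \text{for } n \geq 1,\ 1 \leq i \leq n \]

form a triangular array which satisfies the assumptions of Theorem A.1 in the Appendix. Indeed, we
have the row-independence property and

\[ E^{Q_k(\omega,\cdot)}[Z_{n,i}] = 0, \qquad E^{Q_k(\omega,\cdot)}[Z_{n,i}^2] = \frac{r_{k+i-1}(\omega)\,(1 - r_{k+i-1}(\omega))}{h_{k+n}}. \]

Therefore, by assumption, for $n \to +\infty$, we have for almost every $\omega$,

\[ \sum_{i=1}^{n} E^{Q_k(\omega,\cdot)}[Z_{n,i}^2] = \frac{\sum_{j=k}^{k+n-1} r_j(\omega)(1 - r_j(\omega))}{h_{k+n}} = \sigma_{k+n}^2(\omega) - \frac{h_{k-1}\sigma_{k-1}^2(\omega) + r_{k+n}(\omega)(1 - r_{k+n}(\omega))}{h_{k+n}} \to \sigma^2(\omega). \]

Moreover, under $Q_k(\omega, \cdot)$, we have $Z_n^* := \sup_i |Z_{n,i}| \leq 2/\sqrt{h_{k+n}} \to 0$. Finally, we observe that,
setting $V_n := \sum_{i=1}^{n} Z_{n,i}^2$, we have

\[ E^{Q_k(\omega,\cdot)}[V_n^2] = \Big(\sum_{i=1}^{n} E^{Q_k(\omega,\cdot)}[Z_{n,i}^2]\Big)^2 + \sum_{i=1}^{n} \mathrm{Var}^{Q_k(\omega,\cdot)}[Z_{n,i}^2] \]

with

\[ \sum_{i=1}^{n} \mathrm{Var}^{Q_k(\omega,\cdot)}[Z_{n,i}^2] \leq \sum_{i=1}^{n} E^{Q_k(\omega,\cdot)}[Z_{n,i}^4] \leq 4 \Big(\sigma_{k+n}^2(\omega) - \frac{h_{k-1}\sigma_{k-1}^2(\omega)}{h_{k+n}}\Big) \frac{1}{h_{k+n}}. \]

Since, for almost every $\omega$, the sequence $(\sigma_n^2(\omega))_n$ is bounded and $h_n \uparrow +\infty$, it follows that, for almost
every $\omega$, the sequence $(V_n)_n$ is bounded in $L^2$ under $Q_k(\omega, \cdot)$ and so uniformly integrable. Theorem
A.1 assures that, for almost every $\omega$, the sequence of probability measures

\[ (N_{k+n}(\omega, \cdot))_n \]

weakly converges to the Gaussian distribution $N(0, \sigma^2(\omega))$. This fact implies that, for each bounded
continuous function $g$, we have

\[ E\Big[g\Big(\frac{\sum_{j=k+1}^{k+n} (U_j - r_{j-1})}{\sqrt{h_{k+n}}}\Big) \,\Big|\, \mathcal{G}_k\Big] \xrightarrow{a.s.} N(0, \sigma^2)(g). \]

The $\mathcal{G}_k$-stable convergence (29) obviously follows.

□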



A Appendix 



For the reader's convenience, we state some results used above. 

Theorem A.1. Let $(Z_{n,i})_{n\geq 1,\, 1\leq i\leq k_n}$ be a triangular array of square integrable centered random
variables on a probability space $(\Omega, \mathcal{A}, P)$. Suppose that, for each fixed $n$, $(Z_{n,i})_i$ is independent
("row-independence property"). Moreover, set

\[ \sigma_{n,i}^2 := E[Z_{n,i}^2] = \mathrm{Var}[Z_{n,i}], \qquad \sigma_n^2 := \sum_{i=1}^{k_n} \sigma_{n,i}^2, \]

\[ V_n := \sum_{i=1}^{k_n} Z_{n,i}^2, \qquad Z_n^* := \sup_{1\leq i\leq k_n} |Z_{n,i}|, \]

and assume that $(V_n)_{n\geq 1}$ is uniformly integrable, $Z_n^* \xrightarrow{P} 0$ and $\sigma_n^2 \to \sigma^2$.

Then $\sum_{i=1}^{k_n} Z_{n,i} \xrightarrow{\text{in law}} N(0, \sigma^2)$.

Proof. In Hall and Heyde (1980) (see pp. 53-54) it is proved that, under the uniform
integrability of $(V_n)$, the convergence in probability to zero of $(Z_n^*)_{n\geq 1}$ is equivalent to the Lindeberg
condition. Hence, it is possible to apply Corollary 3.1 (pp. 58-59) in Hall and Heyde (1980) with
$\mathcal{F}_{n,i} = \sigma(Z_{n,1}, \dots, Z_{n,i})$.

Theorem A.2. (See Th. 5 and Cor. 7 of sec. 7 in Crimaldi, Letta and Pratelli (2007))

Let $(l_n)_{n\geq 1}$ be a sequence of strictly positive integers. On a probability space $(\Omega, \mathcal{A}, P)$, for each
$n \geq 1$, let $(\mathcal{F}_{n,j})_{0\leq j\leq l_n}$ be a filtration and $(L_{n,j})_{n\geq 1,\, 0\leq j\leq l_n}$ be a triangular array of real random
variables such that, for each $n$, the family $(L_{n,j})_{0\leq j\leq l_n}$ is a martingale with respect to $(\mathcal{F}_{n,j})_{0\leq j\leq l_n}$
and $L_{n,0} = 0$. For each pair $(n, j)$, with $n \geq 1$, $1 \leq j \leq l_n$, let us set $Z_{n,j} = L_{n,j} - L_{n,j-1}$ and

\[ S_n = \sum_{j=1}^{l_n} Z_{n,j} = L_{n,l_n}, \qquad U_n = \sum_{j=1}^{l_n} Z_{n,j}^2, \qquad Z_n^* = \sup_{1\leq j\leq l_n} |Z_{n,j}|. \]

Let us suppose that the sequence $(U_n)_{n\geq 1}$ converges in probability to a positive random variable $U$
and the sequence $(Z_n^*)_{n\geq 1}$ converges in $L^1$ to zero. Finally, let $\mathcal{N}$ be the sub-$\sigma$-field generated by the
$P$-negligible events of $\mathcal{A}$ and let us set

\[ \mathcal{H}_j = \liminf_n \mathcal{F}_{n,\, j\wedge l_n} \text{ for } j \geq 0, \qquad \mathcal{H} = \bigvee_{j\geq 0} \mathcal{H}_j. \]

If $U$ is measurable with respect to the $\sigma$-field $\mathcal{H} \vee \mathcal{N}$, then $(S_n)_{n\geq 1}$ converges $\mathcal{H}$-stably to the Gaussian
kernel $N(0, U)$.

Theorem A.3. (see Crimaldi, 2007)

On $(\Omega, \mathcal{A}, P)$, let $(V_n)_{n\geq 0}$ be a real martingale with respect to a filtration $\mathcal{G} = (\mathcal{G}_n)_{n\geq 0}$. Suppose that
$(V_n)_{n\geq 0}$ converges in $L^1$ to a random variable $V$. Moreover, setting

\[ U_n := n \sum_{k\geq n} (V_k - V_{k+1})^2, \qquad Z := \sup_k \sqrt{k}\, |V_k - V_{k+1}|, \qquad (30) \]

assume that the following conditions hold:

(a) The random variable $Z$ is integrable.

(b) The sequence $(U_n)_{n\geq 1}$ converges almost surely to a positive real random variable $U$.

Then, with respect to $\mathcal{G}$, the sequence $(W_n)_{n\geq 1}$ defined by

\[ W_n := \sqrt{n}\,(V_n - V) \qquad (31) \]

converges to the Gaussian kernel $N(0, U)$ in the sense of the almost sure conditional convergence.

Obviously the previous almost sure conditional convergence also holds with respect to any filtra- 
tion T' such that C T' n C Q n . 